2015

The most important problems for society are describable only in vague terms, dependent 
on subjective positions, and missing highly relevant data. This thesis is intended to 
revive and further develop the view that giving non-trivial, rigorous deductive arguments 
concerning such problems -without eliminating the complications of vagueness, subjectivity, 
and uncertainty- is, though very difficult, not problematic in principle, does not require 
the invention of new logics (classical first-order logic will do), and is something that 
more mathematically-inclined people should be pursuing. The framework of interpreted 
formal proofs is presented for formalizing and criticizing rigorous deductive arguments 
about vague, subjective, and uncertain issues, and its adequacy is supported largely by a 
number of major examples. This thesis also documents progress towards a web system 
for collaboratively authoring and criticizing such arguments, which is the ultimate goal of 
this project. 
Chapter 1
Introduction 


Gottfried Leibniz had a radical and idealistic dream, long before the formalization of 
predicate logic, that some day the rigor of mathematics would find much broader use. 

For men can be debased by all other gifts; only right reason can be nothing but 
wholesome. But reason will be right beyond all doubt only when it is everywhere 
as clear and certain as only arithmetic has been until now. Then there will 
be an end to that burdensome raising of objections by which one person now 
usually plagues another and which turns so many away from the desire to 
reason. When one person argues, namely, his opponent, instead of examining 
his argument, answers generally, thus, ‘How do you know that your reason 
is any truer than mine? What criterion of truth have you?’ And if the first 
person persists in his argument, his hearers lack the patience to examine it.
For usually many other problems have to be investigated first, and this would
be the work of several weeks, following the laws of thought accepted until now.
And so after much agitation, the emotions usually win out instead of reason,
and we end the controversy by cutting the Gordian knot rather than untying 
it. (Gottfried Leibniz, 1679, “On the General Characteristic” [LL76])

What is understood by many mathematically-inclined people -that formal logic is in 
principle applicable to arguments about social, contentious, emotionally charged issues- 
sounds absurd to most people, even the highly educated. The first, rather unambitious,
goal of this project is to illustrate this understanding. The second goal, a very difficult
and lonely one, is to investigate whether such use of rigorous deduction is worth doing, 
even if only in our spare time. 


There are thousands and thousands of pages by hundreds of scholars that are tangentially
related to this project: papers about vagueness in the abstract, 1 the theoretical
foundations of Bayesian reasoning, 2 abstract argumentation systems [Pra10], etc. There is a
huge amount of scholarly work on systems and tools and consideration of the theoretically- 
interesting corner cases, but too little serious work in which the problems take precedence 
over the tools used to work on them. This thesis concerns a project of the latter kind; 
the work is on important specific problems, attacking general theoretical problems only 
as necessary. In this way, we avoid getting hung up on details that don’t matter.

Surprisingly, it is the (normative side of the) field of Informal Logic that is probably 
most related to this project [Wal08][WK95]. For a long time now the researchers in that 
field have understood that dialogue-like interactions, or something similar, are essential 
for arguing about the problems we are concerned with here (Section 2.1). But formal logic 
has something to contribute here; there are too many examples where good, intelligent 
scientists and statisticians are given a voice on such problems, only to fail to adhere to 
the same standards of rigor that they follow in their professional work. 3 

There are important commonalities between proofs in mathematics and proofs about 
subjective and vague concepts. For example, in both domains, we only need to axiomatize 
the structures we are thinking about precisely enough for the proof to go through; our 
proofs about numbers and sets never 4 require complete characterizations, and similarly, for 
proofs about people, laws, moral values, etc., there is no need to fully eliminate the vagueness
that is inherent in axiomatizations with multiple distinct models. That observation is
materialized in this project’s use of top-down, minimally-reductionist formal proofs -my 
name for formal proofs where one does not strive to minimize the number of fundamental 
(not defined) symbols, or the number of axioms (an assertion remains an assertion until 
someone demands it become a proved lemma). 5 I believe top-down, minimally-reductionist 
formal proofs are the only option when reasoning faithfully about vague concepts. 

1 See [Sor13], where the approach I take to reasoning in the presence of vagueness does not appear to
be covered. I call my approach vagueness as plurality of intended models.

2 I recommend [Pea09].

3 [Ses07] provides a good example. There Sesardic, a philosopher, contradicts the hasty conclusions of 
some very reputable statisticians, essentially by applying the same Bayesian quantitative argument, but 
with much more care taken in constraining the values of the prior probabilities. 

4 Except for proofs about finite structures. 

5 As a non-essential demonstration of the concept of top-down, minimally-reductionist proofs, and of 
the dynamic HTML output I’ve developed for reading arguments, here are two examples that I wrote 
while debugging my current system. The first is fully formally verified by a first-order theorem prover. 
Infinitely-many primes: http://www.cs.toronto.edu/~wehr/thesis/infinitely-many_primes.html 
5 color theorem: http://www.cs.toronto.edu/~wehr/thesis/5colortheorem.html 
There are three aspects of contentious socially-relevant questions that distinguish them
from questions that are commonly considered mathematical 6 : vagueness 7 , subjectiveness 8 ,
and uncertainty. None of these can be eliminated completely without changing the 
fundamental nature of the problems. 

With mathematics problems we can usually axiomatize structures sufficiently precisely
at the beginning of our attempt to resolve a problem (a statement to prove or disprove), 
whereas in reasoning about social issues one must delay the precisifying of vague definitions 
until necessary - in particular, until critics of one’s argument are too unclear about one’s 
informal semantics for a symbol to be able to evaluate whether they should accept or 
reject an axiom that depends on that symbol (this is called a semantics criticism in 
Section 2.2.1). Of course, questions about vague concepts cannot always be answered in a 
particular way. What may happen is that the question has different answers depending 
on how it is precisified, which is determined by the author of an argument that purports
to answer that question (and sometimes, indirectly, by the critics of the argument). An
illustrative example of this can be made with Newcomb’s Paradox 9 ; for all of the many
English presentations of the problem that I have seen, it is not hard to give two reasonable
formalizations that yield opposite answers, a fact that has been ignored, downplayed, or 
overlooked by many commentators arguing that one of the answers is the right answer 
(and likewise for many puzzles or paradoxes argued about in analytic philosophy). 

As with vagueness, subjectiveness demands some system of interaction between people 
on the two sides of an argument, and I am working on an implementation of such a 
system now (Section 9.1). Of course I do not mean to suggest that formal logic can help 
two parties with conflicting beliefs come to the same answer on, say, questions of ethics. 
However, where formal logic can help is to find fundamental sources of disagreement 
starting from disagreement on some complex question (which is progress!). 

Uncertainty is the most difficult of the three complications. Sparsity of information 
can make it impossible to give an absolutely strong deductive argument for or against
a given proposition, and the inability to do so can easily deflate one’s motivation to
make a formal demonstration. But interaction is useful here, too: In Chapter 6, I give
a proof that a key piece of evidence that was used to convict a man of murder has no 
inculpatory value. Now, I cannot say that the assumptions from which that conclusion 
(the proposition named (the newspaper hair evidence is neutral or exculpatory)) follows 

6 But note that, as this thesis will make clear, my opinion is that there is no sharp qualitative boundary
between the two domains.

7 Classic examples are vague predicates expressing tallness, or baldness. 

8 E.g. the weights of various principles of morality.

9 Start at the Wikipedia page if you haven’t heard of this and are curious. 
are absolutely easy to accept, but I confidently challenge anyone to come up with a proof of
the negation of that conclusion, i.e. a proof that the likelihood of the convicted man being 
guilty given the evidence is significantly larger than the likelihood that he is innocent 
given the evidence. Hence, I am claiming that my assumptions are easy to accept relative 
to what my opponents could come up with. 
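
The shape of this challenge can be sketched with the odds form of Bayes' theorem. This is a generic illustration, not the formalization used in Chapter 6; the symbols G (the convicted man is guilty), I (he is innocent), and E (the evidence in question) are introduced here only for the sketch.

```latex
\[
\frac{P(G \mid E)}{P(I \mid E)}
  \;=\;
\frac{P(E \mid G)}{P(E \mid I)} \cdot \frac{P(G)}{P(I)}
\]
```

On this reading, a piece of evidence is neutral or exculpatory when the ratio P(E | G) / P(E | I) is at most 1, because the odds of guilt after seeing E then do not exceed the odds before seeing it. An opponent would need acceptable assumptions that force that ratio significantly above 1.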

I use a superficial extension of classical FOL in this project, and for this particular 
project -rigorous deductive reasoning about the kind of questions described in Section 2.1- 
that seems to be the right fit. It is vital that the interface between syntax and semantics 
is as simple as possible, and classical FOL with Tarski semantics is the best in this respect. 
In Section 2.3 I make that argument in more detail. In Chapter 3, I take some space to 
explain how common forms of defeasible reasoning can be carried out in deductive logic, 
and how ideas from e.g. modal (or, by extension, epistemic or temporal) logic can be used 
as necessary without needing to build them into the definition of the logic (complicating
the semantics). 

1.1 What this project is and isn’t 

What it is: 

This thesis gives the foundations for, and documents progress toward, a collaborative 
web system intended for arguing about certain kinds of questions in the most rigorous, fair, 
and civil way that we know of: formal deductive proof. The main proposed uses/benefits 
of the system are: 

1. Making progress in arguments about questions usually considered outside of math 
and science. 

2. Helping people find the fundamental sources of their disagreements with each other 
(a special case of progress). 

3. Demonstrating deductive thinking to people who are not interested in mathematical 
problems, or are not mathematically inclined. 

4. Giving mathematicians an outlet for advocacy work that utilizes their technical 
abilities in an essential way. 10 

5. Serving as a practically-approachable ideal for rigorous, fair, and civil argumentation. 

The apparent contradiction between items 3 and 4 is resolved by noting that people who 
are mathematically inclined and people who are not have different roles in the system. In 

10 Of course statisticians have always engaged in advocacy work. By “mathematicians” here, I mean 
those in disciplines whose overwhelmingly central focus is on proving theorems. 
particular, authoring a new argument requires at least one person who is familiar with basic
formal logic, but contributing to new and existing arguments, or criticizing arguments, 
does not. Moreover, as a small number of excellent lawyers and philosophers have 
demonstrated over the centuries, writing natural language arguments that approximate 
the ideal (item 5) does not necessarily require familiarity with formal logic. I choose, 
perhaps over-generously, to interpret Leibniz’s (latest, and most pessimistic) writing 
describing his imagined universal characteristic with those notes in mind:

It is true that in the past I planned a new way of calculating suitable for 
matters which have nothing in common with mathematics, and if this kind of 
logic were put into practice, every reasoning, even probabilistic ones, would 
be like that of the mathematician: if need be, the lesser minds which had 
application and good will could, if not accompany the greatest minds, then 
at least follow them. For one could always say: let us calculate, and judge 
correctly through this, as much as the data and reason can provide us with the 
means for it. But I do not know if I will ever be in a position to carry out 
such a project, which requires more than one hand; and it even seems that 
mankind is still not mature enough to lay claim to the advantages which this 
method could provide. (Gottfried Leibniz, 1706 11 ) 

What it isn’t: 

This project is not about using logic for discovery, as in the axiomatic method. And it is 
not concerned with developing general or elegant mathematical theories (including theories 
presented as “logics”). An important premise of this project’s approach is that an informal 
yes/no question must be set before formalization begins (though it may later be modified), 
and only making progress on the question matters (where a more-precise formulation of 
the question is progress). Of course, abstracting out common axiomatizations for reuse is 
still a good idea, but as with writing software libraries, it should not be done preemptively. 
This preoccupation with constructing relatively-elegant, widely-applicable theories, is one 
of two factors to which I attribute the lack of success of the Logical Positivists’ project, 
the other being their focus on questions outside of the Problem Domain (Section 2.1). 

This project is fundamentally normative. It is not concerned with descriptive modelling 
of argumentation, as in abstract argumentation systems [Pra10]. There is no interest here
in modelling real legal reasoning, for example, nor in assisting it. But there is great interest 
here in depicting what legal reasoning should be like in an ideal environment with lots of 

11 From translation of a letter to Sophia of Hanover [LCS11]
time available (e.g. for severe criminal cases, in which there is the rest of the convicted
person’s life to argue that they are innocent with sufficiently-large probability 12 ). 


1.2 Role of formal logic 

Remove formal logic from this project and there is no benefit over our current system 
of arguing with each other through papers and blog posts and shouting. Those mediums 
are easier to work in, and superior if one is interested in persuasion. Indeed, interpreted 
formal proofs (the format for arguments and criticisms advocated in this thesis - see 
Section 2.2) can be made more persuasive to most people if converted to an informal 
argument that mixes natural language and mathematics in the normal way we use in 
conference and journal papers. The problem with that, from the point of view of this 
project, is that unsound, invalid, misleading, unfair, and otherwise bad arguments benefit 
from the lax regulations as much as or more than good arguments do. This project uses
deductive formal logic because it is our best tool for forcing the weaknesses of arguments 
to be exposed. Thus the role of formal logic in this project is regulatory, and 
nothing more than that. The success of this project rides on the regulatory benefit 
outweighing the overhead of formalization. 

I have caught omissions in my own reasoning thanks to the constraints of formal 
deductive logic, things that never occurred to me in thinking and talking about an issue 
for years, resulting in my having to temper my opinion, sometimes temporarily and 
sometimes long term. I have gained respect for my opponents on every issue that I have 
attempted arguments about, having been forced to consider all the subtle details (e.g. 
Canada’s lifetime ban against blood donations from men who have had sex with men, 13 
assisted suicide in Canada, the evidence for anthropogenic global warming 14 ). It is hard 

12 Noting that a convict is adversely affected for the remainder of their life, even if they are released.

13 The efficacy of our system for preventing contaminated blood from ending up in the stock of blood 
donations relies not just on tests for HIV, Hepatitis, etc, but also on self-disclosure of known infections 
and risk-factors. If the ban is lifted, is there non-negligible probability of a significant increase in the rate 
of people lying in the self-disclosure part of the system? Reasoning deductively, one must consider this 
non-obvious question, and I have found no way to derive my target conclusion (that, with additional 
safeguards, the ban should be lifted) without making an assumption that is not far from explicitly 
answering the question ‘no’. We could imagine, for example, that after lifting the ban, some homophobic
HIV-positive person intentionally donates blood in retaliation. That may seem far-fetched, but it must 
nonetheless be ruled out (with high probability), one way or another. That strictness imposed by formal 
deductive logic should be reassuring. 

14 The closest thing we have to a strong deductive argument, that I have found, comes from the Berkeley 
Earth Surface Temperature project, which has mostly been ridiculed by climate science researchers, who 
simply view it as making no significant advancement in climate science, ignoring or not valuing the fact 
that it seeks to minimize the use of argument from expert opinion. 
to write a person off as ignorant or stupid, relative to oneself, after a great struggle to
find acceptable formal assumptions from which it follows logically that they are wrong. It 
is hard to overstate this advantage of rigorous deduction. Among other things it provides 
a force for compromise - a force in the same democratic spirit as “I could be wrong”, but 
with much greater discriminatory power. 

1.3 Preface to examples 

On one hand, the better the quality of the example arguments in Chapters 4-8, the more 
seriously this project and its theoretical ideas will be taken. There is some basis for that; 
the expected value of the project and theoretical ideas is much harder to assess, so
instead one might choose the heuristic of assessing the author, via the remaining material. 

I am urging you to resist that temptation. These are not exemplars of interpreted 
formal proofs. I am not a genius, nor a real statistician, nor an expert in criminal law 
or human biology. I am not even an especially good mathematician. But once the web 
system (Section 9.1) is ready, so that the project has as good a chance at gaining traction
as possible, I will persuade some such people to collaborate with me on new arguments or 
to author their own. Much, much better examples are still to come. 

This, like almost any thesis, is not intended to be read from start to finish. You are 
encouraged to skip ahead after reading at least Sections 2.1 and 2.2. The most complete 
and accessible example is the Sue Rodriguez argument, which should be read in HTML, 
though a static version is given in Chapter 4. The most complete and accessible example 
written in LaTeX is Chapter 5.

1.4 Related Work 

Unfortunately this project does not fit well within any current area of research. On the 
other hand, it would be impossible without certain firmly established work, especially 
the fundamentals of classical predicate logic and Bayesian reasoning/statistics. I have 
chosen to cite related work predominately when it is contextually relevant, throughout 
this thesis. E.g. in Section 1.1 I briefly talk about the field of Abstract Argumentation
Theory, which was not helpful in this project, and in Section 9.1.1 I briefly cover one of 
the implemented projects that has come out of the Informal Logic community. 
1.4.1 Logical Positivism, Logical Empiricism, and Analytic Philosophy Since the Early 1900s

In general in analytic philosophy, there has been an immense amount of writing about 
the hypothetical application of formal logic to vague and subjective issues, but very little 
of the application itself. 

The first scholars to have the requisite technical framework of predicate logic, and to 
attempt to expand its scope to matters outside of traditional mathematics, were analytic 
philosophers of the early-to-mid 1900s, especially those associated with the movements 
of Logical Positivism and Logical Empiricism. Unusually in philosophy, there seems 
to be a general agreement that they failed. See, for example, The Heritage of Logical 
Positivism [Res85]. Before explaining why they failed, it is important to note that they
didn’t fail at this project. Geoffrey Sayre-McCord writes in Logical Positivism and the 
Demise of “Moral Science” 15 : 

... most of the Logical Positivists were convinced that moral theory is nonsense.
They thought their arguments showed that there really is no such thing as
“moral science.” Moral language, they maintained, is not used to report facts, 
rather it is simply a tool used to manipulate the behavior both of ourselves 
and of others. 

That is, most of the positivists shied away from working on questions affected by moral 
relativism, instead spending their formalization efforts on questions from science. 

Considering the few analytic philosophers who made serious attempts to reason in 
formal logic about specific problems outside of mathematics, I attribute their lack of 
progress to two main factors: 

1. Preoccupation with constructing elegant, widely-applicable theories. A premise of 
my approach is that an informal yes/no question must be fixed before formalization 
begins. Hence, the development of general theories about subjective and vague 
matters is explicitly not a goal; only making progress on the question matters. Of 
course, abstracting out common axiomatizations for reuse is still a good idea, but 
as with writing software libraries, it should not be done preemptively. 

2. Working with examples outside of the problem domain I outlined in Section 2.1. 
Because there is less promise of discovering mathematically interesting material 
in the formal investigation of a question about a vague and subjective issue, the 
motivation for the very difficult work involved in rigorous reasoning must come 
entirely from elsewhere, namely from the question itself; we must be convinced that
there is no easier way to make progress on the question, and that making progress 
on the question is indeed worth the work. 

1.4.2 Mathematics and Theoretical Computer Science 

The stigma in mathematics against working toward progress on value-laden issues has 
grown over time. Doing so gets one’s work labeled as philosophy. The stigma is not 
surprising, given what has passed as good work in contemporary philosophy, but on the 
other hand it is a clear fallacy of association to condemn a subject of study on account of 
the people who have managed, so far, to get paid to work on it. Nonetheless, disrepute is 
the current state of things, and has been for many decades. 

As a consequence, the most related work in mathematics differs from the work required 
for this project in one very important respect: It is intentionally reductionist. We find this
in Game Theory and Decision Theory, for example. In these fields, real social problems 
are used to inspire and motivate interesting mathematical problems, but little more than 
that, aside from extremely rare situations when the simplifying assumptions of models 
are met, and countless disastrous situations in which the mathematics is used without 
meeting the simplifying assumptions. 

I should mention, however, that these fields are progressing, with the simplifying 
assumptions for some problems becoming more and more palatable. Perhaps we will see 
strategy-proof mechanisms employed for kidney exchange programs in the future [AFKP13],
for example. Though, I doubt it. I expect that researchers will continue to ignore factors 
that would make their model too messy and unwieldy, and ignore possible changes to their 
models that would result in mathematically-uninteresting solutions to the real problem 
(since then there would be nothing to publish). 16 

1.4.3 Informal Logic, Defeasible Reasoning, and Intensional Logics

The idea to adopt an asymmetric dialog system (author vs critics) came from work on 
argumentation systems in Informal Logic; see [WK95] for one of many sources. 

16 Minimizing the number of people who die waiting for kidney transplants, for example. The sophisticated
algorithms found in the Algorithmic Game Theory literature implicitly use a model in which, for
example, there is no possibility of legislative solutions that mandate participation in a kidney exchange 
(say, for the hospital to be eligible for federal funding) while criminalizing or penalizing lying (see previous 
source for details about why hospitals might lie). 
Chapter 3 is devoted to explaining the inappropriateness of defeasible logic for this
project, and Section 2.3 does the same for intensional logics. All work that I have found
on argumentation about subjective and vague questions uses defeasible or intensional
logics. The vast majority of it is concerned with the construction of formal systems 
for hypothetical use (i.e. never seriously applied) in reasoning about subjective and 
vague questions. Unfortunately, these systems are not useful for the task of writing an 
isolated deductive argument, as they come with the overhead of their own syntax and 
semantics, and because, as I hope the examples I provide will convey, one cannot expect 
to find a formal system of axioms that is completely adequate for one’s argument; each 
argument requires at least a slightly new system in order to be formulated in the most 
natural way, and formulation in the most natural way is vital for approaching the goal of 
locally-refutable proofs (page 6). 
Chapter 2

Proofs and critiques 


2.1 Problem domain 

Provided the uncertainty involved in a problem is not too great, or that it is too 
great but one side of the argument has the burden of proof, it is my view, from several 
years of working on this project and a decade of thoughts leading up to it, that the 
main impediments to rigorous deductive reasoning about socially relevant issues are 
a) conventional mathematical modeling difficulty; b) conventional mathematical problem
solving difficulty 1 ; and c) tedium 2 . These are strong impediments. For that reason, I think 
it is worthwhile to describe the questions that I think are best-suited for rigorous deductive 
reasoning. These are contentious questions with ample time available. Typical sources of 
such problems are public policy and law. Without ample time, it may be detrimental to 
insist on deductive reasoning; as pointed out in many places, when complete heuristic 
reasoning and incomplete deductive reasoning are the only options, it is probably best 
to go with the former. Without contentiousness, there is little motive for employing 
fallacious reasoning and rhetoric to advance one’s position, and this, I think, defeats 

1 Two of my proofs, the Leighton Hay argument and the smoking/cancer argument, are currently
contingent on the truth of mathematical statements that I cannot easily prove. This is my attitude 
about such statements: there are mathematicians out there who can easily prove or disprove them, 
but I think it would be premature to call upon them until proofs of the statements have actually been 
demanded by critics (called a mathematics detail criticism in the paper). In the meantime, I give some 
empirical evidence of their truth (in this case, numerical evaluation of a complicated integral, without 
error bounds). Most importantly, there are other, more-subjective axioms of the proof that are much 
easier targets for criticism. It may even be wise to build in some precedence in the rules for criticizing 
an interpreted formal proof, whereby under certain conditions (which aren’t obvious to me at present) 
one must accept the axioms that involve vague and subjective concepts before demanding a proof of a 
purely-mathematical claim (of course, one should always be able to present disproofs). 

2 This has been the hardest of the three for me to cope with. My hope is that this impediment will be 
reduced by making the construction of such arguments a collaborative, social process on the web, with 
an editor having auto-suggestion and other features of modern IDEs (Section 9.1). 
much of the benefit of using formal logic (or some approximation of it, as appears in
mathematics journals). At the same time, lack of contentiousness does not proportionally 
reduce the work required for rigor, so we are left with less expected benefit relative to 
cost. Leibniz was conscious of this point: 

I certainly believe that it is useful to depart from rigorous demonstration in 
geometry because errors are easily avoided there, but in metaphysical and 
ethical matters I think we should follow the greatest rigor, since error is very 
easy here. Yet if we had an established characteristic 3 we might reason as 
safely in metaphysics as in mathematics. 

(Gottfried Leibniz, 1678, Letter to Walter von Tschirnhaus [LL76])

In contrast, some prominent Logical Positivists seem to have thought that this is not a 
crucial constraint (e.g. Hans Reichenbach’s work on axiomatizing the theory of relativity). 

2.2 Interpreted proofs and critiques 

This section defines an elaboration of a kind of document that most teachers of first-order 
logic have used at least implicitly. The point is just to make concrete and explicit a 
bridge between the formal and informal, providing a particular way, which is amenable to 
criticism, for an author of a proof to describe their intended semantics in the metalanguage. 

The definition of interpreted formal proof is tailored for classical many-sorted first-order
logic, but it will be clear that a similar definition can be given for any logic that
has a somewhat Tarski-like semantics, including the usual untyped classical first-order
logic, or fancier versions of many-sorted first-order logic. 4 A very minor extension of
the usual definition of many-sorted first-order logic (where sorts must be interpreted as
disjoint sets that partition the universe) with easily-eliminable sort operators is used here 
and in the examples. A language is just a set of symbols, each of which is designated a 
constant, predicate, function, sort, or sort-operator symbol. A signature is a language 
together with an assignment of types to the symbols (or, in the case of sort operators, 
just an assignment of arities). 

There are four kinds of formal axioms that appear in an interpreted formal proof: 

3 Leibniz is referring to the practical system/method that he envisioned, but was unable to devise. 

4 Earlier versions of this thesis included the syntax and semantics of such a fancier logic. That logic 
is a little more convenient for formalization, but I discarded it because it introduces another barrier to 
entry for users of the system, and because its reduction to many-sorted first order logic -the language of 
resolution theorem provers- introduced too many usually-unuseful axioms that drastically slowed down 
proof search. 
• An assumption imposes a significant constraint on the semantics of vague symbols 
(most symbols other than common mathematical ones), even when the semantics of the 
mathematical symbols are completely fixed. 

• A claim does not impose a significant constraint on the semantics of vague symbols. It 
is a proposition that the author of the proof is claiming would be formally provable 
upon adding sufficiently-many uncontroversial axioms to the theory. 

• A simplifying assumption is a special kind of an assumption, although what counts as 
a simplifying assumption is vague. The author of the proof uses it in the same way as 
in the sciences; it is an assumption that implies an acknowledged inaccuracy, or other 
technically-unjustifiable constraint, that is useful for the sake of saving time in the 
argument, and that the author believes does not bias the results. 

• A definition is, as usual, an axiom that completely determines the interpretation of a 
new symbol in terms of the interpretations of previously-introduced symbols. 

A language interpretation guide g for (the language of) a given signature is simply 
a function that maps each symbol in the language to some natural language text that 
describes, often vaguely, what the author intends to count as an intended interpretation 
of the symbol. Due to the vagueness in the problems we are interested in, a set of axioms 
will have many intended models. Typically g(s) will be between the length of a sentence 
and a long paragraph. 

A signature’s language has sort symbols, which structures must interpret as disjoint 
subsets of the universe. A language can also have sort operator symbols, which are second 
order function symbols that can only be applied to sorts. In this project sort operators 
have a nonvital role, used for uniformly assigning names and meanings to sorts that are 
definable as a function of simpler sorts, when that function is used multiple times and/or 
is applied to vague sorts (i.e. sorts in $\mathcal{L}_{\mathrm{vague}}$, introduced below). 5 A signature assigns 
sorts to its constants, and types to its function and predicate symbols. In this project, 
types are mostly used as a lightweight way of formally restricting the domain on which 
the informal semantics of a symbol must be given (by the language interpretation guide). 
To see why they are beneficial, suppose that we didn’t have them, e.g. that we were 
using normal FOL. For the sake of clarity, we would nonetheless usually need to specify 
types either informally in the language interpretation guide, or formally as axioms. In 
the first case, we inflate the entries of the language interpretation guide with text that 

5 For example, if our proof only needs the power set of one mathematical sort $S$ (in $\mathcal{L}_{\mathrm{rigid}}$), then using 
a sort operator would have little benefit over just introducing another mathematical sort symbol named 
$2^S$. Arguably one cannot say the same if $S$ is a vague sort (in $\mathcal{L}_{\mathrm{vague}}$), since then we would have to 
introduce $2^S$ as a vague sort as well, and there is some value for readers in minimizing the number of 
vague symbols when possible. 
rarely needs to be changed as an argument progresses, and that often can be remembered 
sufficiently after reading it only once. In the second case, we clutter the set of interesting 
axioms (e.g. the non-obvious and controversial axioms) with uninteresting typing axioms. 

A sentence label is one of {assum, simp, claim, defn, goal}, where assum is short for 
assumption and simp is short for simplifying assumption. A symbol label is one of 
{vague, math, def}. 

An interpreted formal proof is given by the following components. Intuitively, $\mathcal{L}_{\mathrm{rigid}}$ is 
for symbols that should have the same interpretation in all models. 

• A signature $\Sigma$. 

• A set of well-typed $\Sigma$-sentences $\Gamma$ called the axioms. 

• An assignment of symbol labels to the symbols of $\Sigma$. If $\mathcal{L}$ is the language of $\Sigma$, then 
for each symbol label $l$ we write $\mathcal{L}_l$ for the symbols assigned label $l$. 

• An assignment of sentence labels to the elements of $\Gamma$, with one sentence labeled goal. 
For each sentence label $l$ we write $\Gamma_l$ for the sentences in $\Gamma$ labeled $l$. 

• An assignment of one of the sentence labels assum or simp to each type assignment of 
$\Sigma$. These typing declarations can be viewed as sentences too, and though they will 
usually be regular assumptions (labeled assum), occasionally it’s useful to make one a 
simplifying assumption (labeled simp). 

• The sentences in $\Gamma_{\mathrm{defn}}$ define the constant, function, and predicate symbols in $\mathcal{L}_{\mathrm{def}}$. 
Function and predicate symbol definitions have a form like 
$\forall x_1{:}S_1.\ \ldots\ \forall x_k{:}S_k.\ f(x_1, \ldots, x_k) = t$, 
where $t$ can be a term or formula (in the latter case, replace $=$ with $\leftrightarrow$) and the $S_i$ 
are sorts. 

• $\mathcal{L}_{\mathrm{vague}}$, $\mathcal{L}_{\mathrm{rigid}}$, $\mathcal{L}_{\mathrm{def}}$ are disjoint languages, $\mathcal{L}_{\mathrm{vague}}$ does not contain any sort-operator 
symbols, 6 and $\mathcal{L}_{\mathrm{def}}$ contains neither sort nor sort-operator symbols 7 . 

• $g$ is a language interpretation guide for a subset of the language of $\Sigma$ that includes 
$\mathcal{L}_{\mathrm{vague}}$ and $\mathcal{L}_{\mathrm{rigid}}$. So, giving explicit informal semantics for a defined symbol is optional 8 . 

• Optionally for each axiom, a natural language translation of the axiom. 9 

6 I suppose that restriction could be lifted, but I haven’t had any desire for vague sort operators in all 
the time I’ve worked on this project. 

7 Another inessential constraint, which I’ve added simply so that I don’t have to include something in 
the grammar for defining sorts or sort-operators in terms of other sorts and sort operators 

8 This may change after experience with criticizing proofs on the web (Section 9.1), as the intended 
semantics of a defined symbol can be very obscure relative to the intended semantics of the symbols used 
in the definition, despite the fact that the former semantics is completely determined by the latter. 

9 This was added late to the definition of interpreted formal proofs, as it appears to introduce another 
source of semantics problems (see “translation criticism” below). The dilemma is that natural language 
translations will be demanded by readers of interpreted formal proofs as they are usually easier to read, 
and simply positing that such translations, when given, should not be trusted for veracity, will not make 
it so. If this becomes a problem, adding features of English to the logic may be useful. On the other 
• $\Gamma_{\mathrm{goal}}$ is provable from $\Gamma_{\mathrm{assum}} \cup \Gamma_{\mathrm{simp}} \cup \Gamma_{\mathrm{claim}} \cup \Gamma_{\mathrm{defn}}$. 

• For each $\varphi \in \Gamma_{\mathrm{claim}}$, any reader in the intended audience of the proof can come up with 
a set of $\mathcal{L}_{\mathrm{rigid}}$-sentences $\Delta$, which are true with respect to the (informal) semantics given 
by $g$, such that $\Gamma_{\mathrm{assum}} \cup \Gamma_{\mathrm{defn}} \cup \Gamma_{\mathrm{simp}} \cup \Delta$ proves $\varphi$. The first paragraph of the next 
section gives a more-precise condition. 
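
The purely structural conditions in the definition above can be checked mechanically. The following is a minimal sketch, not the thesis's implementation: the class name `InterpretedProof`, its field names, and the example symbols are all hypothetical, axioms are treated as opaque strings, and the symbol labels are named after the language subscripts (vague, rigid, def). The two provability conditions are deliberately not checked, since they need a theorem prover and a human audience.

```python
from dataclasses import dataclass

SYMBOL_LABELS = {"vague", "rigid", "def"}
SENTENCE_LABELS = {"assum", "simp", "claim", "defn", "goal"}


@dataclass
class InterpretedProof:
    symbol_label: dict   # symbol name -> "vague" | "rigid" | "def"
    sorts: set           # names of the sort symbols
    sort_operators: set  # names of the sort-operator symbols
    axioms: dict         # axiom text -> sentence label
    guide: dict          # symbol name -> natural language description

    def problems(self):
        """Return the violated structural conditions (empty list if none)."""
        found = []
        for sym, label in self.symbol_label.items():
            if label not in SYMBOL_LABELS:
                found.append(f"unknown symbol label for {sym}")
            # The vague language contains no sort-operator symbols.
            if label == "vague" and sym in self.sort_operators:
                found.append(f"vague sort operator: {sym}")
            # The defined language contains neither sorts nor sort operators.
            if label == "def" and (sym in self.sorts or sym in self.sort_operators):
                found.append(f"defined sort or sort operator: {sym}")
            # The guide must cover every vague and rigid symbol.
            if label in ("vague", "rigid") and sym not in self.guide:
                found.append(f"no guide entry for {sym}")
        for axiom, label in self.axioms.items():
            if label not in SENTENCE_LABELS:
                found.append(f"unknown sentence label for {axiom}")
        goals = [a for a, label in self.axioms.items() if label == "goal"]
        if len(goals) != 1:
            found.append("exactly one sentence must be labeled goal")
        return found


# Hypothetical toy proof; the axiom strings are placeholders, not real syntax.
proof = InterpretedProof(
    symbol_label={"Person": "vague", "age": "vague", "Nat": "rigid", "adult": "def"},
    sorts={"Person", "Nat"},
    sort_operators=set(),
    axioms={
        "forall p:Person. adult(p) <-> age(p) >= 18": "defn",
        "forall p:Person. age(p) >= 0": "claim",
        "exists p:Person. adult(p)": "goal",
    },
    guide={
        "Person": "A living human resident of the jurisdiction in question.",
        "age": "Completed years since birth, as recorded on official documents.",
        "Nat": "The natural numbers with their usual ordering.",
    },
)
print(proof.problems())
```

Because each symbol is mapped to exactly one label, disjointness of the three languages holds by construction in this representation.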

2.2.1 Criticizing interpreted formal proofs 

$\mathcal{L}_{\mathrm{rigid}}$ is intended to be used mostly for established mathematical structures, but in general 
for structures that both sides of an argument agree upon sufficiently well that they are 
effectively objective with respect to $\Gamma_{\mathrm{claim}}$. For each person $p$ in the intended audience of 
the proof, let $\Delta_p$ be the set of $\mathcal{L}_{\mathrm{rigid}}$-sentences that $p$ can eventually and permanently 
recognize as true with respect to the informal semantics given by $g$. Then we should have 
that $\bigcap_{p \in \mathrm{audience}} \Delta_p$ is consistent and when combined with $\Gamma_{\mathrm{assum}} \cup \Gamma_{\mathrm{defn}} \cup \Gamma_{\mathrm{simp}}$ proves every 
claim in $\Gamma_{\mathrm{claim}}$. If that is not the case, then there is some symbol in $\mathcal{L}_{\mathrm{rigid}}$ that should be 
in $\mathcal{L}_{\mathrm{vague}}$, or else the intended audience is too broad. 

The purpose of the language interpretation guide is for the author to convey to readers 
what they consider to be an acceptable interpretation of the language. Subjectiveness 
results in different readers interpreting the language differently, and vagueness 
results in each reader having multiple interpretations that are acceptable 
to them. Nonetheless, an ideal language interpretation guide is detailed enough that 
readers will be able to conceive of a vague set of personal $\Sigma$-structures that is precise 
enough for them to be able to accept or reject each assumption (element of $\Gamma_{\mathrm{assum}} \cup \Gamma_{\mathrm{simp}}$) 
independent of the other axioms. When that is not the case, the reader should raise a 
semantics criticism (defined below), which is analogous to asking “What do you mean by 
X?”. 

In more detail, to review an interpreted proof $\pi$ with signature $\Sigma$ and language $\mathcal{L}$, 
you read the language interpretation guide $g$, and the axioms $\Gamma$, and either accept $\pi$ or 
criticize it in one of the following ways: 

(1) Semantic criticism: Give $\varphi \in \Gamma$ and at least one symbol $s$ of $\mathcal{L}_{\mathrm{vague}}$ that occurs in 
$\varphi$, and report that $g(s)$ is not clear enough for you to evaluate $\varphi$, which means to 
conclude that all, some, or none of your personal $\Sigma$-structures satisfy $\varphi$. If you cannot 
resolve this criticism using communication with the author in the metalanguage, 
then you should submit a $\Sigma$-sentence $\psi$ to the author, which is interpreted by the 
author as the question: Is $\psi$’s truth compatible with the intended semantics given 
by $g$? 

hand, see Footnote 10 on page 6 about how this is a special case of language interpretation guide entries 
for defined symbols. 

(2) Classification criticism: Criticize the inclusion of a symbol in $\mathcal{L}_{\mathrm{rigid}}$, if necessary by 
doing the same as in (1) but for $\mathcal{L}_{\mathrm{rigid}}$. This is the mechanism by which one can 
insist that vague terms be recognized as such. The same can be done when $\varphi$ is a 
type assignment or sort constraint, in which case $\varphi$ is a $\Sigma$-sentence that uses sort 
symbols as unary predicate symbols. 

(3) Mathematics detail criticism: Ask for some claim in $\Gamma_{\mathrm{claim}}$ to be proved from simpler 
claims (about $\mathcal{L}_{\mathrm{rigid}}$ interpretations). 

(4) Subjective criticism: Reject some sentence $\varphi \in \Gamma_{\mathrm{assum}} \cup \Gamma_{\mathrm{simp}}$, which means to 
conclude that at least one of your personal $\Sigma$-structures falsifies $\varphi$. If you wish to 
communicate this to the author, you should additionally communicate one of the 
following: 

(a) Strongly reject $\varphi$: Tentative commitment to $\neg\varphi$, i.e. that all of your personal 
$\Sigma$-structures falsify $\varphi$. 

(b) Weakly reject $\varphi$: Tentative commitment to the independence of $\varphi$, i.e. that $\varphi$ 
is also satisfied by at least one of your personal $\Sigma$-structures. Intuitively, this 
means that $\varphi$ corresponds to a simplifying assumption that you are not willing 
to adopt. 

(5) Translation criticism : Criticize as misleading the informal natural language text 
attached to an axiom. 10 
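
The difference between accepting, strongly rejecting, and weakly rejecting an assumption can be made concrete with a toy sketch. It assumes, purely for illustration, that a reader's personal structures form a small finite collection and that the truth of the assumption in each one can be computed; the function name `review_assumption` and the example data are hypothetical.

```python
def review_assumption(personal_structures, satisfies):
    """Classify a reader's response to one assumption.

    personal_structures: the interpretations the reader finds acceptable
    satisfies: maps a structure to True or False, or to None when the
               interpretation guide is too unclear to decide
    """
    verdicts = [satisfies(m) for m in personal_structures]
    if not verdicts:
        raise ValueError("reader has no personal structures")
    if any(v is None for v in verdicts):
        return "semantic criticism"   # cannot evaluate: ask what is meant
    if all(verdicts):
        return "accept"               # every personal structure satisfies it
    if not any(verdicts):
        return "strongly reject"      # every personal structure falsifies it
    return "weakly reject"            # some satisfy it and some falsify it


# Hypothetical reader whose structures disagree about one vague quantity.
structures = [{"poverty_line": 18000}, {"poverty_line": 22000}]
print(review_assumption(structures, lambda m: m["poverty_line"] >= 15000))  # accept
print(review_assumption(structures, lambda m: m["poverty_line"] >= 20000))  # weakly reject
print(review_assumption(structures, lambda m: m["poverty_line"] >= 30000))  # strongly reject
```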

In the context of its intended audience, we say that an interpreted formal proof is 
locally-refutable if no member of the intended audience raises semantic or classification 
criticisms when reviewing it. A locally-refutable proof has the desirable property that 
by using the language interpretation guide g , any member of the audience can evaluate 
each of the axioms of the proof independently of the other axioms. Local-refutability 
is the ideal for interpreted formal proofs. It is a property that is strongly lacking in 
most mathematical arguments in economics or game theory, for example, and in every 
sophisticated abuse of statistics. When an argument is far from locally-refutable, it is 
hard to criticize in a standard, methodical manner, and that ease of criticism is a central 
goal of this project. 

10 This is actually a special case of criticizing the optional semantic description attached to a definition, 
since one can always replace an axiom $A$ with a new defined 0-ary predicate symbol $P_A \leftrightarrow A$ and the 
new axiom $P_A$. 
2.3 Choice of logic 

For this project I use a superficial extension of classical many-sorted first-order logic 
(MSFOL), itself a superficial extension of classical first-order logic (FOL), and for this 
particular project -rigorous deductive reasoning about the kind of questions described in 
Section 2.1- I believe that is the right choice: 

Claim: Nothing simpler than classical FOL will suffice, and nothing significantly 
more complex is necessary. 

Note 1: Though there are many vocal advocates of nonclassical logics, the scope of 
the previous claim -this project only- is narrow enough that there may be no serious 
dispute of it, putting it outside of the problem domain described in Section 2.1 due to 
lack of contentiousness. I will therefore not put the effort into giving a rigorous deductive 
argument for the claim, and indeed my argument will be neither rigorous nor deductive. 
That said, I believe that I could find such a formalization without significantly weakening 
my position in the process. 

Note 2: I do believe that the claim holds more broadly -that nonclassical logics 
should generally be framed and thought of as mathematical theories, as opposed to 
the grandiose framing as alternatives to classical logic 11 - but it is not relevant to this 
project to argue that here. Maria Manzano in [Man96] gave arguments in support of 
this, demonstrating in one book that second-order logic, type theory, modal and dynamic 
logics can be naturally and usefully simulated in MSFOL. Her book appears to have 
been mostly ignored by researchers in applied nonclassical logic, and according to [Ven98] 
she made too strong a claim about the usefulness of translations to MSFOL. I do not 
advocate that researchers should use the syntax of MSFOL in all situations, since often a 
custom syntax is more concise, and allows for more-natural statements of metatheorems 
and simpler programming of automated theorem proving tools (e.g. if one is programming 
a decision procedure for some version of propositional temporal logic, it would be silly 
and inefficient to use a problem encoding with bound variables). 

Back to the topic of this section: the choice of logic for this project. I will focus on the 
second part of the above Claim, as arguments about the need for the expressiveness of 
FOL, in particular its predicate symbols and variables, are commonplace. 

The main factor compelling the use of FOL is this: It is vital for this project that the 
interface between syntax and semantics is as simple and transparent as possible, and FOL 

11 Which has social implications that I won’t go into, e.g. giving outsiders the impression that formal 
logic is controversial or in flux. 
with Tarski semantics is the best in this respect. I know of no substantial dispute of the 
relative simplicity and transparency of Tarski semantics. There are criticisms of material 
implication, of course, but such criticisms stem from misuse of classical logic, in particular 
use motivated by a desire to extract or impose some sort of meaning from validity 
that is different from its characterization in terms of truth-with-respect-to-structures. 
Unfortunately, there are no official directions for the use of FOL, but if there were, they 
would clearly imply that to reason formally using other sorts of implication, e.g. relevant 
implication, causality, or conditional probability, one should develop a mathematical 
theory of relevant implication, causality, or conditional probability, and formalize it as a 
first-order theory. Since such mathematical theories, without fail, are more complex and 
nuanced than Tarski semantics, we have no substantial dispute of the relative simplicity 
and transparency of FOL with Tarski semantics. 

The second factor compelling the use of classical FOL, part of which is alluded to in 
the previous paragraph’s counter-criticism of criticisms of material implication, is this: 
extensions or similarly-expressive alternatives to FOL are either (1) unuseful or conflicting 
with the goals of this project, or (2) useful and compatible with the goals of this project, 
but simulating their benefits for specific proofs is not hard, and so the added complexity 
of complicating the logic is not justified. This claim is clarified by the following 
Hypothesis: 

Let L be any nonclassical alternative to MSFOL with its own semantics. I hypothesize 
that the difficulty of converting a proof in L to an equally-readable proof in MSFOL, 
or of extending the definition of MSFOL to accommodate the useful features of L 12 , is 
proportional to the difficulty, compared to MSFOL, of interpreting sets of L-sentences 
using L's semantics. Furthermore, whenever there is no added difficulty of interpreting 
sets of sentences, or when the features of L allow us to write easier to interpret (sets of) 
sentences, 13 then I hypothesize we can already simulate those features with low overhead 
in MSFOL, and/or we can easily extend our definition of MSFOL to accommodate those 
features, without straying much from Tarski semantics (a “superficial extension”). 

I should clarify that the Hypothesis is not an argument against the study of defeasible 
and nonclassical logics in general, because its force depends on some uncommon aspects 
of this project: 

• Sentences about vague, subjective concepts and uncertain knowledge are by nature 
already more difficult to interpret than sentences about traditional mathematical 

12 Which I have done, and undone, several times, ending with MSFOL plus type operators. 

13 As I believe is the case for type operators (AKA parametric polymorphism), subtyping, and [Far93]- 
style partial functions, although the latter two features have drawbacks when it comes to proof search 
using currently existing theorem provers. 
concepts and certain knowledge. 

• With the top-down, minimally-reductionist approach to proofs that I advocate, 
there are fewer syntactic definitions (more non-defined symbols) and more axioms, 
making ease of interpretation more important than it usually is in applications of 
formal logic. 

• Theory-construction is not a goal; we only care about individual proofs. Because of 
this, simulation of the used features of another logic L can be proof specific, often 
making the task much easier than a general translation of all L-sentences. 

In Chapter 3 I will explain why the arguments used to motivate defeasible logic do not 
apply to this project, and give some minimal examples of formalizing defeasible reasoning 
in deductive logic. 14 In contrast to there, where I argue that there is nothing to be gained 
from using defeasible logic for this project, here I will briefly argue that something would 
be lost. 

Claim 1: The requirements of formal deduction make interpreted formal 
proof versions of all defeasible arguments less psychologically persuasive, 
though bad defeasible arguments suffer worse than good ones. If defeasible 
logic were permitted, there would be much less incentive to ever do the extra 
work required for formal deduction, as bad defeasible arguments can be more 
psychologically persuasive than good deductive ones. 

Claim 2: Argumentation with defeasible logic is not profoundly different 
from natural language argumentation (e.g. see Section 9.1.1 about the online 
defeasible argumentation system Carneades), whereas argumentation with 
deductive logic is. The novelty of interpreted formal proofs about socially- 
relevant issues is an essential motivation for this thesis, so asking whether 
defeasible logic should be used reduces to asking whether this thesis should 
be written at all. Thus, if you are convinced that this thesis was adequately 
motivated, then you should be convinced that declining defeasible logic was 
adequately motivated as well. 

The remainder of this section, far from considering all alternatives to FOL other than 
defeasible logics, is devoted to a consideration of modal logics. This is for concreteness, 
and because, as I said earlier (page 2.3), it is not clear that a deductive argument for 

14 The major interpreted formal proofs that take up the bulk of this thesis contain more complex 
examples, although they are not labeled as such. 
why I’ve chosen FOL is called for. Moreover, many of the popular extensions of FOL are 
modal logics, and modal logics are especially popular in philosophy, which is the field that 
has historically taken the greatest interest in formal reasoning about contentious issues. 

The case against modal logic for this project comes down to two points: 

(i) Modal logic introduces another syntactic device by which semantically-complex 
sentences can be written in very simple terms. Note that this property is the main 
advantage of modal logic for other applications (other than this project). Note also 
that FOL (and modal first-order logic, by extension) is not without the capacity to 
disguise semantically-complex sentences, namely by using deeply nested definitions 
(but see Footnote 8 on page 4 for discussion of how to address that). 

(ii) Modal logics are easy to simulate in MSFOL, in a natural way, by formalizing 
possible-world semantics. Hence, in the worst case we would be writing slightly 
more-verbose sentences (and even that can be mitigated using syntactic definitions). 
Also, more can be expressed in the translated MSFOL language than can in the 
language of modal logic. 

First let’s look at the general simulation, and after that I will use an example to illustrate 
(i). This is a standard simulation that can be found, in more detail, in [BdRV02]. 

Let $\Sigma$ be a signature for a many-sorted first-order modal logic with one or more 
□-like modal connectives $\Box_1, \ldots$ and with corresponding ◇-like connectives $\Diamond_j$. The 
corresponding MSFOL signature $\Sigma'$ for the simulation is the same as $\Sigma$ except: 

• It has an extra sort $W$ for worlds. The variables $w, w_1, w_2, \ldots$ are reserved for $W$. 

• For each □-like modal connective $\Box_j$, it has an extra predicate symbol $R_j$ with domain 
type $W \times W$ for the corresponding reachability relation. 

• Each function or predicate symbol $f$ in $\Sigma$ whose domain type is $S_1 \times \ldots \times S_n$ has 
domain type $W \times S_1 \times \ldots \times S_n$ in $\Sigma'$. That is, an extra argument is added for the 
world with respect to which the predicate or function is being evaluated. 

Then $\Sigma$-sentences are easily translated to $\Sigma'$-sentences by the following function $\ulcorner \cdot \urcorner$, which 
given a $\Sigma$-formula $A$ produces a $\Sigma'$-formula with the same number and sorts of free 
variables as $A$, plus exactly one free variable $w$ of sort $W$. The syntax $t[w \mapsto w']$ means 
to substitute variable $w'$ for free occurrences of variable $w$ in $t$. 

• $\ulcorner P(t_1, \ldots, t_n) \urcorner = P(w, \ulcorner t_1 \urcorner, \ldots, \ulcorner t_n \urcorner)$ for $P$ a predicate symbol, including $=$. 

• $\ulcorner f(t_1, \ldots, t_n) \urcorner = f(w, \ulcorner t_1 \urcorner, \ldots, \ulcorner t_n \urcorner)$ for $f$ a function symbol. 

• $\ulcorner A \Rightarrow B \urcorner = \ulcorner A \urcorner \Rightarrow \ulcorner B \urcorner$ and similarly for the other boolean connectives. 

• $\ulcorner \Box_j A \urcorner = \forall w_i.\ R_j(w_i, w) \Rightarrow \ulcorner A \urcorner[w \mapsto w_i]$, where $w_i$ does not occur in $A$. Thus, $\ulcorner \Box_j A \urcorner$ 
holds in the current world $w$ iff $A$ holds in every world that is $R_j$-reachable from $w$. 
Note that in most presentations of frame semantics, the order of the arguments to 
the reachability relation $R$ is swapped. I use this order so that we can display the 
translated formula more neatly as $\forall w_i R_j w.\ \ulcorner A \urcorner[w \mapsto w_i]$. 

• $\ulcorner \Diamond_j A \urcorner = \exists w_i.\ R_j(w_i, w) \wedge \ulcorner A \urcorner[w \mapsto w_i]$, where $w_i$ does not occur in $A$. We can 
display such a formula more neatly as $\exists w_i R_j w.\ \ulcorner A \urcorner[w \mapsto w_i]$. 
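
The translation clauses above are easy to mechanize. The sketch below is an illustration under simplifying assumptions, not code from the thesis: it covers only the propositional modal fragment (0-ary predicates, as in the Fitch example later in this section), represents formulas as nested tuples, and emits the result as a string in an ad hoc ASCII syntax. Instead of translating and then substituting $w \mapsto w_i$, it passes the current world variable as a parameter, which yields the same formula.

```python
from itertools import count


def translate(formula, world="w", fresh=None):
    """Translate a propositional modal formula into a first-order formula.

    Formulas are tuples: ("atom", "P"), ("not", A), ("and", A, B),
    ("implies", A, B), ("box", j, A), ("dia", j, A), where j names the
    modality. The result is a string whose only free variable is `world`.
    """
    if fresh is None:
        fresh = count(1)  # source of unused world-variable indices
    tag = formula[0]
    if tag == "atom":
        # A 0-ary predicate gains one argument: the world of evaluation.
        return f"{formula[1]}({world})"
    if tag == "not":
        return f"~{translate(formula[1], world, fresh)}"
    if tag in ("and", "implies"):
        op = "&" if tag == "and" else "=>"
        left = translate(formula[1], world, fresh)
        right = translate(formula[2], world, fresh)
        return f"({left} {op} {right})"
    if tag in ("box", "dia"):
        _, j, body = formula
        v = f"w{next(fresh)}"  # fresh world variable
        inner = translate(body, v, fresh)
        if tag == "box":
            return f"(forall {v}. (R_{j}({v}, {world}) => {inner}))"
        return f"(exists {v}. (R_{j}({v}, {world}) & {inner}))"
    raise ValueError(f"unknown connective: {tag}")


P = ("atom", "P")
G = ("implies", P, ("box", "K", P))  # P => KP
print(translate(G))  # (P(w) => (forall w1. (R_K(w1, w) => P(w1))))
```

Note that the reachability predicate takes the reached world first and the current world second, matching the argument order chosen in the text.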

For any modal logic $L$ that can be characterized by frame semantics, I claim there is 
a recursive (and usually finite) set of $\Sigma'$-sentences $\Gamma_L$ that capture validity for $L$. That 
is, such that for any $\Sigma$-sentence $B$ and set of $\Sigma$-sentences $A_1, A_2, \ldots$ we have 

$A_1, A_2, \ldots \models_L B$ iff $\Gamma_L, \forall w.\ \ulcorner A_1 \urcorner, \forall w.\ \ulcorner A_2 \urcorner, \ldots \models \forall w.\ \ulcorner B \urcorner$ 

where $\models_L$ is entailment for $L$ and $\models$ is entailment for MSFOL. 

Note that, according to the definition of interpreted formal proof, language interpretation 
guide entries must be given for the worlds sort $W$ and the reachability relations $R_j$. 
Those components of the modal logic semantics are usually not given explicit intended 
interpretations in applications of modal logic in philosophy, and I claim this is the only 
reason why modal logic “paradoxes” in philosophy do not get resolved; the intended 
semantics remains too vague. 

As an example, I will use Fitch’s Paradox of Knowability. It is a proof in the language 
of a propositional modal logic with either one □-like modal connective $K$ or two □-like 
modal connectives $K$ and □. I will use the latter formulation, since it is easy to convert 
to the former by identifying $K$ and □. In either case, only one ◇-like connective is used, 
and it is connected to □. The axioms are substitution instances of the following axiom 
schemas, where $\varphi$, $\varphi_1$, $\varphi_2$ are metavariables that range over formulas, except for schema 
(□-Valid) where they range over provable formulas. In every presentation of the proof 
that I have seen, an (overly-simple) English translation of each sentence is provided. I am 
giving the ones from [CC14], except for (□◇-Connection) for which I give two versions. 
(K-Factivity)  $K\varphi \Rightarrow \varphi$  “If a proposition is known, then it is true.” 

(K∧-Distributivity)  $K(\varphi_1 \wedge \varphi_2) \Rightarrow K\varphi_1 \wedge K\varphi_2$  “If a conjunction is known, then its conjuncts are also known.” 

(□◇-Connection)  $\Box(\neg\varphi) \Rightarrow (\neg\Diamond\varphi)$  One direction of “$\Box\neg\varphi$ is logically equivalent to $\neg\Diamond\varphi$.” The SEP article [BS13] translates it as “necessarily $\neg\varphi$ entails that $\varphi$ is impossible.” 

(Knowability Principle)  $\varphi \Rightarrow \Diamond K\varphi$  “Every truth is knowable.” 

(□-Valid)  $\Box\varphi$, if $\varphi$ is provable from K-Factivity and K∧-Distributivity 


From those axiom schemas and the rules of classical propositional logic, it follows that 
for any formula $\varphi$, 

(G) $\varphi \Rightarrow K\varphi$ “Every truth is known” (the unwanted conclusion) 
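
For reference, the derivation of (G) can be laid out step by step. This is the standard textbook form of Fitch's argument rather than a transcription from the sources cited here; the abbreviation $\psi$ for $\varphi \wedge \neg K\varphi$ is introduced only for this display, and the `align*` environment assumes the `amsmath` package.

```latex
\begin{align*}
&1.\ K\psi \Rightarrow K\varphi \wedge K\neg K\varphi
   && \text{($K\wedge$-Distributivity), with } \psi := \varphi \wedge \neg K\varphi \\
&2.\ K\neg K\varphi \Rightarrow \neg K\varphi
   && \text{($K$-Factivity)} \\
&3.\ \neg K\psi
   && \text{from 1 and 2: } K\psi \text{ would give both } K\varphi \text{ and } \neg K\varphi \\
&4.\ \Box\neg K\psi
   && \text{($\Box$-Valid) applied to 3} \\
&5.\ \Box(\neg K\psi) \Rightarrow \neg\Diamond K\psi
   && \text{($\Box\Diamond$-Connection)} \\
&6.\ \neg\Diamond K\psi
   && \text{from 4 and 5} \\
&7.\ \psi \Rightarrow \Diamond K\psi
   && \text{(Knowability Principle)} \\
&8.\ \neg\psi
   && \text{from 6 and 7} \\
&9.\ \varphi \Rightarrow K\varphi
   && \text{from 8, since } \neg(\varphi \wedge \neg K\varphi) \text{ is classically equivalent to (G)}
\end{align*}
```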

I will argue that the proof is circular when $K$ is S5-like 15 , whereas in weaker modal 
logics G simply does not mean “Every truth is known,” and the meanings of the (Knowability 
Principle) and (K-Factivity) are far from clear (see second-to-last paragraph of 
this section for why they are unclear). My position is that the ongoing philosophical 
discussion about this proof is completely reliant on underspecified semantics. 

Since the translation I gave is for first-order modal logic, I’ll briefly describe how 
to convert to that form. Any nonlogical language can be used, since the proof is really 
a meta-level proof, but it is convenient to use a minimal language with a single 0-ary 
predicate symbol $P$, in which case we can take the goal sentence G to be $P \Rightarrow KP$. Then, 
the substitution instances of the above axiom schema needed to derive G and justify the 

15 More accurately, when the corresponding reachability relation $R_K$ is trivial in that every world is 
reachable from every other. 
instance of (□-Valid) 16 are: 

(K-Factivity)  $K\neg KP \Rightarrow \neg KP$ 

(K∧-Distributivity)  $K(P \wedge \neg KP) \Rightarrow KP \wedge K\neg KP$ 

(□◇-Connection)  $\Box(\neg K\neg G) \Rightarrow (\neg\Diamond K\neg G)$ 

(Knowability Principle)  $\neg G \Rightarrow \Diamond K\neg G$ 

(□-Valid)  $\Box\neg K\neg G$ 


And the universal closures of the translations of those axioms into MSFOL prove $\forall w.\ \ulcorner G \urcorner$. 
In fact, it is easy to see that the universal closures of the translations of (K∧-Distributivity) 
and (□◇-Connection) are theorems of MSFOL, and (□-Valid) can be made a lemma 
provable from the other axioms, since it is a logical consequence of the (Knowability 
Principle) and (□◇-Connection) 17 . Therefore, I will give the translations only for the 
remaining two axioms, since they are the only ones that can be criticized. Note that 
$\ulcorner G \urcorner = P(w) \Rightarrow \forall w' R_K w.\ P(w')$. 

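1. $\forall w.\ [\forall w_1 R_K w.\ \neg\forall w_2 R_K w_1.\ P(w_2)] \Rightarrow \neg\forall w' R_K w.\ P(w')$, i.e. 

$\forall w.\ [\forall w_1 R_K w.\ \exists w_2 R_K w_1.\ \neg P(w_2)] \Rightarrow \exists w' R_K w.\ \neg P(w')$. Suppose that for every 
world $w_1$ that is $K$-reachable from the current world, there exists a world $w_2$ that is 
$K$-reachable from $w_1$ in which $P$ is false. Then there exists a world $K$-reachable 
from the current world in which $P$ is false. 

2. $\forall w.\ \neg\ulcorner G \urcorner \Rightarrow \exists w_1 R_\Box w.\ \forall w_2 R_K w_1.\ \neg\ulcorner G \urcorner[w \mapsto w_2]$ 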


For the class of models that satisfy $\forall w_1, w_2.\ R_K(w_1, w_2)$, those axioms are equivalent 
to 

1. $\neg\forall w.\ P(w) \Rightarrow \neg\forall w.\ P(w)$, a tautology. 

2. $\forall w.\ \neg\ulcorner G \urcorner \Rightarrow (\mathrm{NI}_\Box(w) \wedge \forall w_2.\ \neg\ulcorner G \urcorner[w \mapsto w_2])$ where $\mathrm{NI}_\Box(w)$ (‘not isolated’) abbreviates 
$\exists w_1.\ R_\Box(w_1, w)$. For the models under consideration, $\forall w_2.\ \neg\ulcorner G \urcorner[w \mapsto w_2]$ is 
false, so the consequent in the implication is false, and the sentence further reduces 
to $\forall w.\ \ulcorner G \urcorner$. 

So, for that class of models the (Knowability Principle) instance is equivalent to G, i.e. 
the argument is circular. 
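
The equivalence claimed above can be sanity-checked by brute force over small finite models. The sketch below is an illustration, not part of the thesis: it enumerates every valuation of $P$ and every relation $R_\Box$ on one, two, and three worlds, with $R_K$ fixed to the total relation, and compares the translated (Knowability Principle) instance with $\forall w.\ \ulcorner G \urcorner$. The function names are hypothetical, and relations are sets of pairs (reached world, current world), following the argument order used in the text. A finite check of this kind supports the claim but does not replace the argument given above.

```python
from itertools import product


def G(x, W, P, RK):
    """Translation of G = (P => KP), evaluated at world x."""
    return (not P[x]) or all(P[v] for v in W if (v, x) in RK)


def knowability_instance(W, P, RK, RB):
    """forall w. ~G(w) => exists w1 R_box w. forall w2 R_K w1. ~G(w2)."""
    return all(
        G(w, W, P, RK)
        or any(
            (w1, w) in RB
            and all(not G(w2, W, P, RK) for w2 in W if (w2, w1) in RK)
            for w1 in W
        )
        for w in W
    )


def goal(W, P, RK):
    """forall w. G(w)."""
    return all(G(w, W, P, RK) for w in W)


def models(n):
    """All (W, P, R_K, R_box) over n worlds with R_K the total relation."""
    W = range(n)
    pairs = [(a, b) for a in W for b in W]
    for p_bits in product([False, True], repeat=n):
        for r_bits in product([False, True], repeat=len(pairs)):
            RB = {pair for pair, bit in zip(pairs, r_bits) if bit}
            yield W, p_bits, set(pairs), RB


# True iff the two sentences agree on every enumerated model.
circular = all(
    knowability_instance(W, P, RK, RB) == goal(W, P, RK)
    for n in (1, 2, 3)
    for W, P, RK, RB in models(n)
)

# With a non-total R_K the two can come apart (hypothetical 2-world model).
W, P, RK, RB = range(2), (True, False), {(1, 0)}, {(1, 0)}
print(circular, knowability_instance(W, P, RK, RB), goal(W, P, RK))  # True True False
```

The last line exhibits a model in which the (Knowability Principle) instance holds while $\forall w.\ \ulcorner G \urcorner$ fails, which is why the collapse to circularity depends on $R_K$ being total.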

Thus, we can assume that at least some of the author’s intended models of the axioms 
falsify $\forall w_1, w_2.\ R_K(w_1, w_2)$. But then we can no longer collapse quantifiers; the two axioms 
16 That is, the third and fourth axioms prove $\neg K\neg G$, which is the prerequisite needed to use $\Box\neg K\neg G$ 
as an axiom. 

17 Using the fact that if $\forall w.\ \varphi(w)$ is valid for any MSFOL-formula $\varphi$, then $\forall w.\ \forall w' R_\Box w.\ \varphi(w')$ is valid 
also. 
1. $\forall w.\ [\forall w_1 R_K w.\ \exists w_2 R_K w_1.\ \neg P(w_2)] \Rightarrow \exists w' R_K w.\ \neg P(w')$. 

2. $\forall w.\ \neg\ulcorner G \urcorner \Rightarrow \exists w_1 R_\Box w.\ \forall w_2 R_K w_1.\ \neg\ulcorner G \urcorner[w \mapsto w_2]$ 

cannot obviously be simplified. And without having more-precise semantics for $R_K$, $R_\Box$, 
and $W$, I claim one cannot evaluate whether those axioms are satisfied for the author’s 
intended interpretations. Now let’s see why it would be hard to give any interesting 
interpretation of those axioms. The (K-Factivity) axiom schema $K\varphi \Rightarrow \varphi$ seems acceptable 
according to the naive interpretation of $K$ when written in the language of modal logic. 
However, for instances when $\varphi$ itself contains modal connectives, the naive interpretation 
of the sentence “If $\varphi$ is known, then it is true” is plainly wrong. The reason is that $\varphi$ 
talks about different objects of the universe (namely, different worlds) when evaluated at 
different worlds; the sentence should be read as “If $\varphi$ is known, then a certain formula 
related to $\varphi$ is true”. The same goes for the (Knowability Principle) $\varphi \Rightarrow \Diamond K\varphi$; it is 
fine to interpret it as “Every truth is knowable” only if $\varphi$ contains no modal connectives. 
Otherwise, one must say the more verbose “If $\varphi$ is true, then a certain formula related to 
$\varphi$ is knowable.” This point is similar to the criticism given by Kvanvig in a number of 
papers beginning with [Kva95] (and later an entire book!). 

Of course we did not need to translate to MSFOL to make the criticism of the previous 
two paragraphs. We could have just used the language of frame semantics. But that is 
missing the point. The criticism explained why the axioms seemed reasonable, which is a 
requirement for a strong rebuttal in philosophy, and I have demonstrated that it is not 
hard to do this in MSFOL. But my position -the position of the system advocated in 
this thesis- is that this is asking too much of a critic. My work in the criticism should 
have ended much earlier, at the point just before considering the two cases of whether or 
not there are intended interpretations that falsify $\forall w_1, w_2.\ R_K(w_1, w_2)$. At that point, I 
should simply make a semantics criticism with either the (K-Factivity) or (Knowability 
Principle) instance and one of the symbols $R_K$, $R_\Box$, or $W$. The virtue of insisting on 
writing the axioms in MSFOL is just this: it forces the author of the proof to reveal the 
complexity of their axioms, rather than putting that burden on the critic. 

2.4 Implementation for reading and verifying interpreted 
formal proofs 

A good approximation of the format in which I intend interpreted formal proofs to be 
read (in an effort to make reading them less effortful and tedious) can be seen in any of 
these examples: 

1. Sue Rodriguez’s case at the Supreme Court of Canada: http://www.cs.toronto.edu/~wehr/thesis/sue_rodriguez.html 

2. Assisted suicide should be legalized in Canada: http://www.cs.toronto.edu/~wehr/thesis/assisted_suicide_msfol.html 

3. Walton’s intentionally-fallacious argument that no one should get married: http://www.cs.toronto.edu/~wehr/thesis/walton_marriage.html 

4. Infinitely-many primes: http://www.cs.toronto.edu/~wehr/thesis/infinitely-many.primes.html 

5. 5 color theorem: http://www.cs.toronto.edu/~wehr/thesis/5colortheorem.html 

6. High-level proof of Gödel’s Second Incompleteness Theorem: http://www.cs.toronto.edu/~wehr/thesis/G2b.html 

An interpreted formal proof is implemented as an HTML document with the following 
structure: 

• A sequence of declarations, each of which is a new symbol introduction (with 
definitions being a special case), axiom, or lemma. An axiom is either an Assumption, 
Simplifying Assumption, Assertion (intended to be uncontroversial), Claim (which the 
author is prepared to prove, once challenged), or Quasi-definition (a symbol introduction 
together with an axiom that is not technically a syntactic definition, but plays a 
definition-like role). 

• The statement of a lemma or theorem A, the goal of the interpreted formal proof, 
which uses only the previously-introduced symbols. 

• A collapsible proof of A, which is another interpreted formal proof whose immediate-child 
declarations, combined with those that preceded the statement of A, entail A, 
where the entailment is verified by a first-order theorem prover 18 (see below for more 
detail). Note that this is slightly atypical in that one may delay introducing axioms 
and primitive symbols until just before they are used in a proof. The purpose of this 
is to lessen the effect of an interpreted formal proof starting with an overwhelming 
number of symbol introductions and axioms before it even gets to the statement of 
the goal sentence. Instead, the declarations that must precede the statement of a 
lemma are just the symbol introductions for the symbols that are explicitly used in 
the lemma. 

• Each lemma is the goal of an interpreted formal proof or else it has an informal 
natural language proof. 

18 “preceded” means the declaration is in its scope, where scope is defined as in many programming 
languages. For example, if a lemma $A_2$ immediately follows a lemma $A_1$, then the proof of $A_2$ can use 
$A_1$ but nothing introduced in the proof of $A_1$, and the proof of $A_1$ cannot use $A_2$. 
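The scoping rule described in footnote 18 can be sketched in a few lines of Python. This is only an illustration under assumed names (`Decl`, `usable_in_proof`, and the declaration names in the usage note are hypothetical, not taken from the actual implementation): a lemma's proof may use every declaration that precedes the lemma's statement, plus the immediate children of its own proof, but nothing introduced inside the proof of an earlier lemma.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Decl:
    """A declaration: a symbol introduction, an axiom, or a lemma (hypothetical model)."""
    name: str
    kind: str  # "symbol", "axiom", or "lemma"
    proof: List["Decl"] = field(default_factory=list)  # declarations inside a lemma's proof


def usable_in_proof(decls: List[Decl], target: str,
                    inherited: Optional[List[str]] = None) -> Optional[List[str]]:
    """Names usable when proving lemma `target`, or None if `target` is absent."""
    scope = list(inherited or [])  # copy, so sibling proofs never leak into each other
    for d in decls:
        if d.name == target:
            # Everything that preceded the statement, plus the proof's own children.
            return scope + [child.name for child in d.proof]
        nested = usable_in_proof(d.proof, target, scope)
        if nested is not None:
            return nested
        scope.append(d.name)  # d itself becomes visible, but not the contents of its proof
    return None
```

For example, with a document `[Decl("s", "symbol"), Decl("A1", "lemma", [Decl("h", "axiom")]), Decl("A2", "lemma", [Decl("k", "axiom")])]`, the proof of `A2` may use `s`, `A1`, and `k`, but not `h`, and the proof of `A1` may not use `A2`.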

Such HTML documents will be a central part of the web system described in Section 
9.1. The main advantages over reading in LaTeX/PDF are these: 

• Collapsible sections of text (implemented). This is helpful as an author if you 
want to hide nasty parts of the proof by default, and for readers it is helpful for 
decluttering the screen once they are satisfied with a proof/justification of some 
lemma/claim, or once they have sufficiently-internalized the syntactic definition or 
informal semantic description of a symbol. 

• Pop-up references on cursor hover (implemented). Hovering the mouse cursor over 
occurrences of symbols will reveal information from their initial declaration. This is 
more useful for proofs about socially-relevant issues than it usually would be for 
proofs in mathematics, because of the much higher ratio 

$$\frac{\text{number of fundamental symbols with no standard meaning}}{\text{length of proof}}$$

• Reader comments (implemented): As a temporary stand-in for the plans of Section 
9.1, readers can attach annotations (e.g. for criticisms) to any part of an argument 
(via AnnotateIt.org), or post (nested) comments at the bottom of the page (via 
DISQUS.com). 

• Renameable symbols (in the works). If the name chosen by the author is not 
conducive to your reading, then change it! 

• Multiline display of formulas (in the works). As in some programming languages, 
there will be a standard format, in terms of where white space is placed, for 
displaying the structure of formulas across multiple lines, so that an author need 
only indicate with a checkbox whether the children of a subterm should be displayed 
on different lines. 

The displayed syntax need not be the same as the input syntax. In particular, two 
distinct symbols can display the same way. Hovering the cursor over an occurrence will 
reveal which version it is. This takes care of most of the use cases for overloading (where 
e.g. a function symbol can have multiple function types) while remaining in standard 
many-sorted first-order logic. 

Most of the examples listed above are written in a formal language 19 , which gets 
translated to HTML and to instances of theorem proving problems in many-sorted 
first-order logic, specifically the TFF (typed first-order formula) language of TPTP[Sut09]. 
Those problems can be solved automatically by first-order theorem provers, although 
when there are a very large number of axioms and definitions, as in the Assisted Suicide 
argument (Chapter 8), it is sometimes necessary to tell the prover which axioms to use 
for each lemma. 20 I used SNARK[SWC00] for type checking and sometimes short proofs, 
CVC4[BCD+11] for model finding and sometimes proof search, and Vampire[KV13] for fast 
proof search and sometimes countermodels, all via the System on TPTP web interface 21 
(except that Vampire was also easy to set up locally). 

19 The exceptions are examples 5 and 6, which I wrote before implementing the formal language. 
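To give a rough idea of what a TFF problem looks like, here is a small Python sketch that prints type declarations and one axiom in TPTP's typed first-order syntax. It is not the translator used for the examples above; the sort and predicate names (`person`, `action`, `does`, `should_not_do`) and the helper functions are hypothetical, and the output should be checked against the TPTP syntax documentation before being given to a prover. The axiom has the shape "if P should not do A, and doing B requires doing A, then P should not do B", which appears in the Walton example of Section 2.5.

```python
# Hypothetical miniature signature; the names are illustrative only.
SORTS = ["person", "action"]
PREDICATES = {
    "does": ["person", "action"],
    "should_not_do": ["person", "action"],
}


def tff_sort(name):
    """Declare a new sort (a TFF type)."""
    return f"tff({name}_type, type, {name}: $tType)."


def tff_predicate(name, arg_sorts):
    """Declare a predicate symbol; $o is the TFF type of truth values."""
    if len(arg_sorts) == 1:
        args = arg_sorts[0]
    else:
        args = "(" + " * ".join(arg_sorts) + ")"
    return f"tff({name}_type, type, {name}: {args} > $o)."


def tff_problem():
    """Assemble sort declarations, predicate declarations, and one axiom."""
    lines = [tff_sort(s) for s in SORTS]
    lines += [tff_predicate(n, a) for n, a in PREDICATES.items()]
    lines.append(
        "tff(assertion_3, axiom, ![P: person, A: action, B: action]: "
        "((should_not_do(P, A) & (does(P, B) => does(P, A))) => should_not_do(P, B)))."
    )
    return "\n".join(lines)


print(tff_problem())
```

In TFF, variables start with an upper-case letter and carry their sort at the quantifier, which is why the many-sorted axioms of an interpreted formal proof map onto it fairly directly.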

2.5 Toy example: Walton’s fallacious argument demonstrating 
equivocation via “variability of strictness of standards” 

This example is also available for reading in HTML: 

http://www.cs.toronto.edu/~wehr/thesis/walton_marriage.html 

Here is the informal argument, verbatim from [Wal08]: 

1. Getting married involves promising to live with a person for the rest of your life. 

2. Nobody can safely predict compatibility with another person for life. 

3. One should not make a promise unless one can safely predict that one will keep it. 

4. If two people aren’t compatible, they can’t live together. 

5. One should not promise to do something one cannot do. 

6. Therefore, nobody should ever get married. 

Lines 1-4 of the informal argument correspond to Assertion 1 and Assumptions 1-3 below. Line 5 is 
redundant, and line 6 corresponds to the proved conclusion, Proposition 1 below. 

Sorts: 

• P is for people. 

• D for dates (e.g. 1 Sept 1998). 

• A for potential actions that are associated with a particular date, but not a particular 
person (like verbs). 

• $\Psi$ contains a subset of the (formula, object assignment) pairs (see Quasi-Definitions 1 and 
2). The intended interpretation of this symbol would be simpler if we introduced 
constant symbols for each element of P and D, in which case I could just identify 
$\Psi$ with a particular finite set of sentences (of the form of the formulas on the right 
side of $\leftrightarrow$ in Quasi-Definitions 1 and 2). 

20 This is due to the non-goal-directed nature of the saturation-based first-order theorem provers that I 
have used; it is possible that a backwards theorem prover, perhaps even a cut-free proof search, would 
work better in such cases, but I have not yet found a good, easy to set up implementation. 

21 http://www.cs.miami.edu/~tptp/cgi-bin/SystemOnTPTP 


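Predicate symbols: 

Does : $P \times A \to \mathbb{B}$ 
LiveWith : $P \times P \times D \times D \to \mathbb{B}$ 
Holds : $\Psi \to \mathbb{B}$ 
Compatible : $P \times P \times D \times D \to \mathbb{B}$ 
CanSafelyPredict : $P \times \Psi \times D \to \mathbb{B}$ 
ShouldNotDo : $P \times A \to \mathbb{B}$ 

Function symbols: 

getMarriedTo : $P \times D \to A$ 
makePromise : $\Psi \times D \to A$ 
liveWithTillDeath : $P \times P \times D \to \Psi$ 
compatibleTillDeath : $P \times P \times D \to \Psi$ 
dateOfDeath : $P \to D$ 
min : $D \times D \to D$ 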


Style notes: 

• The following variables are reserved for the following types: d for D, a for A, p and 
q for P, and $\psi$ for $\Psi$. Similarly for the primed and subscripted versions of those 
variables. 

• I leave out leading universal quantifiers. 

• To improve readability, when a function symbol takes arguments of type P, I put 
the arguments in the subscript, as in $\mathrm{Does}_p(a)$, and when a function symbol takes 
one or more arguments of type D, I put them in the superscript, as in $\mathrm{LiveWith}_{p,q}^{d,d'}$. 

Formalization notes: 

• It is not hard to correct the argument for the objection that it clearly doesn’t 
work when p and q are near death, in which case it’s especially reasonable to reject 
Assumption 1. I haven’t done so since the argument has other, more-serious flaws. 

• If after reading the next bullet list about the informal semantics, you would, like 
me, still reject Assumption 2 for being too broad, then move the sentence to the 
position of a premise of the goal sentence (Equation 1). 

• In retrospect, it would have been more economical to make $\mathrm{liveWithTillDeath}_{p,q}^{d}$ 
and $\mathrm{compatibleTillDeath}_{p,q}^{d}$ primitive instead of $\mathrm{LiveWith}_{p,q}^{d,d'}$ and $\mathrm{Compatible}_{p,q}^{d,d'}$, 
but the way I’ve done it is more faithful to Walton’s presentation. 

Here is a sketch of part of the informal semantics (i.e. a language interpretation guide): 

• $\mathrm{LiveWith}_{p,q}^{d,d'}$ iff p and q are both alive and share the same main residence during 
the period from d to d'. 

• $\mathrm{makePromise}^{d}(\psi)$ means to make an utterance, on date d, like “I promise that A”, 
which is directed at someone with the intention of their interpreting it as a sincere 
and literal statement. 

• The semantics of $\mathrm{compatibleTillDeath}_{p,q}^{d}$ and $\mathrm{liveWithTillDeath}_{p,q}^{d}$ are essentially 
determined by the semantics of the other symbols by Quasi-Definitions 1 and 2. 

• The informal semantics for the other symbols (except for Compatible and CanSafelyPredict, 
which correspond to the terms in the informal argument that are used with varying 
“strictness of standards”) are not surprising and not hard to flesh out. 

Definition 1. The date when the first of p or q dies. 

$$\mathrm{firstDeath}_{p,q} := \min(\mathrm{dateOfDeath}_p, \mathrm{dateOfDeath}_q)$$

Quasi-Definition 1. For all p, q, d there is a proposition $\mathrm{liveWithTillDeath}_{p,q}^{d} : \Psi$ that 
holds iff p and q live together from the date d until one of them dies. 

$$\forall p, q, d.\ \mathrm{Holds}(\mathrm{liveWithTillDeath}_{p,q}^{d}) \leftrightarrow \mathrm{LiveWith}_{p,q}^{d,\,\mathrm{firstDeath}_{p,q}}$$

Quasi-Definition 2. For all p, q, d there is a proposition $\mathrm{compatibleTillDeath}_{p,q}^{d} : \Psi$ that 
holds iff p and q are compatible from the date d until one of them dies. 

$$\forall p, q, d.\ \mathrm{Holds}(\mathrm{compatibleTillDeath}_{p,q}^{d}) \leftrightarrow \mathrm{Compatible}_{p,q}^{d,\,\mathrm{firstDeath}_{p,q}}$$

Assertion 1. If p marries q (on date d), then p makes the promise (on date d) that they 
will live together until one of them dies. 

$$\mathrm{Does}_p(\mathrm{getMarriedTo}_q^{d}) \Rightarrow \mathrm{Does}_p(\mathrm{makePromise}^{d}(\mathrm{liveWithTillDeath}_{p,q}^{d}))$$

Assumption 1. Roughly: You can’t safely predict that two people will be compatible 
till death. More precisely: No person p', on any date, can safely predict that two people 
p and q will be compatible from that date until one of their deaths. 

$$\neg\mathrm{CanSafelyPredict}_{p'}^{d}(\mathrm{compatibleTillDeath}_{p,q}^{d})$$

Assumption 2. If p cannot (on date d) safely predict that $\psi$ will be true, then p should 
not (on date d) promise that $\psi$ will be true. 22 

$$\neg\mathrm{CanSafelyPredict}_{p}^{d}(\psi) \Rightarrow \mathrm{ShouldNotDo}_p(\mathrm{makePromise}^{d}(\psi))$$

Assumption 3. Two people who are incompatible during a period cannot live together 
during that period. 

$$\neg\mathrm{Compatible}_{p,q}^{d,d'} \Rightarrow \neg\mathrm{LiveWith}_{p,q}^{d,d'}$$

Assertion 2. If p (on date d) can safely predict that $\psi$ will hold, and $\psi$ implies $\psi'$, then 
p (on date d) can safely predict that $\psi'$ will hold. 

$$\mathrm{CanSafelyPredict}_{p}^{d}(\psi) \wedge (\mathrm{Holds}(\psi) \Rightarrow \mathrm{Holds}(\psi')) \Rightarrow \mathrm{CanSafelyPredict}_{p}^{d}(\psi')$$

Assertion 3. If p should not do action a, and doing a' requires doing a, then p should 
not do a'. 

$$\mathrm{ShouldNotDo}_p(a) \wedge (\mathrm{Does}_p(a') \Rightarrow \mathrm{Does}_p(a)) \Rightarrow \mathrm{ShouldNotDo}_p(a')$$


22 There is a sense in which it would be more technically correct to write “is true” at both places where 
I wrote “will be true”, since the truth value of a sentence does not depend on time, but on the other hand 
“will be true” is consistent with common usage of English, where one can say “I think A will be true” 
in order to convey the meaning “I think A is true, but we won’t know for sure until some point in the 
future.” 

The axioms prove the following, which is the goal sentence: 

Proposition 1. $\mathrm{ShouldNotDo}_p(\mathrm{getMarriedTo}_q^{d})$ 

Proof. Let p, q, d be arbitrary. Assertions 1 and 3 imply 

$$\mathrm{ShouldNotDo}_p(\mathrm{makePromise}^{d}(\mathrm{liveWithTillDeath}_{p,q}^{d})) \Rightarrow \mathrm{ShouldNotDo}_p(\mathrm{getMarriedTo}_q^{d})$$

Hence it suffices to prove 

$$\mathrm{ShouldNotDo}_p(\mathrm{makePromise}^{d}(\mathrm{liveWithTillDeath}_{p,q}^{d}))$$

An instance of Assumption 2 is: 

$$\neg\mathrm{CanSafelyPredict}_{p}^{d}(\mathrm{liveWithTillDeath}_{p,q}^{d}) \Rightarrow \mathrm{ShouldNotDo}_p(\mathrm{makePromise}^{d}(\mathrm{liveWithTillDeath}_{p,q}^{d}))$$

And so it suffices to prove 

$$\neg\mathrm{CanSafelyPredict}_{p}^{d}(\mathrm{liveWithTillDeath}_{p,q}^{d}) \tag{2.1}$$

An instance of Assumption 1 gives: 

$$\neg\mathrm{CanSafelyPredict}_{p}^{d}(\mathrm{compatibleTillDeath}_{p,q}^{d}) \tag{2.2}$$

Using Quasi-Definitions 1 and 2 and Assumption 3 we can derive: 

$$\mathrm{Holds}(\mathrm{liveWithTillDeath}_{p,q}^{d}) \Rightarrow \mathrm{Holds}(\mathrm{compatibleTillDeath}_{p,q}^{d}) \tag{2.3}$$

Finally, from (2.2), (2.3) and Assertion 2, (2.1) follows. □ 
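As a sanity check on the proof above, the following Python sketch brute-forces a propositional abstraction of it: p, q and d are held fixed, each ground atom becomes a boolean, and every assignment is enumerated. This is not how the proofs in this thesis are verified (that is done with first-order theorem provers); the atom names such as `csp_live` are hypothetical shorthands introduced here, and the check covers only the ground instances the proof actually uses.

```python
from itertools import product

# One boolean per ground atom, for fixed p, q, d (hypothetical shorthand names).
ATOMS = ["marry", "promise", "snd_promise", "snd_marry", "csp_live",
         "csp_compat", "holds_live", "holds_compat", "live_with", "compatible"]


def implies(a, b):
    return (not a) or b


def axioms(v):
    """Ground instances of the axioms used in the proof of Proposition 1."""
    return {
        "QuasiDef1": v["holds_live"] == v["live_with"],
        "QuasiDef2": v["holds_compat"] == v["compatible"],
        "Assertion1": implies(v["marry"], v["promise"]),
        "Assumption1": not v["csp_compat"],
        "Assumption2": implies(not v["csp_live"], v["snd_promise"]),
        "Assumption3": implies(not v["compatible"], not v["live_with"]),
        "Assertion2": implies(
            v["csp_live"] and implies(v["holds_live"], v["holds_compat"]),
            v["csp_compat"]),
        "Assertion3": implies(
            v["snd_promise"] and implies(v["marry"], v["promise"]),
            v["snd_marry"]),
    }


def entails_goal(dropped=()):
    """True iff every assignment satisfying the kept axioms makes the goal true."""
    for values in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        kept = [ok for name, ok in axioms(v).items() if name not in dropped]
        if all(kept) and not v["snd_marry"]:
            return False  # found a countermodel
    return True


print(entails_goal())                           # all axioms kept
print(entails_goal(dropped=("Assumption1",)))   # without Assumption 1
```

With all axioms kept, the goal is entailed; dropping Assumption 1 or Assumption 3 admits a countermodel, which matches the later discussion that the argument stands or falls with those two assumptions.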


Assume that the issues I mentioned above under “Formalization notes” have been dealt 
with. Provided the audience takes the marriage vows seriously (e.g. imagine they are all 
devout Catholics), you should be able to fill in the language interpretation guide, starting 
from the sketch I gave above, in such a way that it would be hard for any audience 
member to reject any of the axioms except for exactly one of Assumptions 1 or 3, and 
possibly Assumption 2. 

Let’s suppose that we accept Assumption 2. So we focus on Assumptions 1 and 3: 

Assumption 1: $\neg\mathrm{CanSafelyPredict}_{p'}^{d}(\mathrm{compatibleTillDeath}_{p,q}^{d})$ 

Assumption 3: $\neg\mathrm{Compatible}_{p,q}^{d,d'} \Rightarrow \neg\mathrm{LiveWith}_{p,q}^{d,d'}$ 

Furthermore, suppose that I raise a semantics criticism against the symbol CanSafelyPredict, 
and that our dialogue rules allow me, as a critic, to suggest an extension of the language 
and axioms. I suggest the introduction of a predicate symbol for personal probability 
assessment on a particular date, 23 together with the new sort symbol [0,1] for the real 
interval [0, 1] 24 : 

Prob : $P \times \Psi \times D \times [0,1] \to \mathbb{B}$ 

The intended semantics is for $\mathrm{Prob}_{p}^{d}(\psi, x)$ to mean that on date d, person p thinks $\psi$ is or 
will be true with probability at least x. And I suggest the axiom: 25 

$$\mathrm{CanSafelyPredict}_{p}^{d}(\psi) \Rightarrow \mathrm{Prob}_{p}^{d}(\psi, .8)$$

23 Similar to Bayesian probability, with the intended semantics given in terms of betting games, but 
without the convention that a probability is assigned to every proposition. 

24 Or the rationals in [0,1], or even a finite set such as {0, .01, .02, ..., .98, .99, 1} would suffice. 

With that in place, it would be harder for the author of the proof to equivocate about 
the meaning of CanSafelyPredict. Now, the only symbol whose intended semantics is 
too vague (whose definition is too incomplete) for us to evaluate Assumptions 1 and 3 is 
$\mathrm{Compatible}_{p,q}^{d,d'}$. And that brings us to the serious flaw in the informal argument. 
For Assumption 1 to be true under a given interpretation, the semantic definition of 
$\mathrm{Compatible}_{p,q}^{d,d'}$ needs to be fairly strong, but for Assumption 3 to be true under a given 
interpretation, the semantic definition of $\mathrm{Compatible}_{p,q}^{d,d'}$ needs to be quite weak (meaning 
its extension is large). For example, I would not be abusing the dialogue rules if I rejected 
Assumption 3 for any definition of $\mathrm{Compatible}_{p,q}^{d,d'}$ that is much stronger than this: 

p and q are compatible during [d, d'] unless one of them poses a physical 
danger to the other, or one of them makes an effective legal action to remove 
the other from the household. 

And, for such a weak definition of $\mathrm{Compatible}_{p,q}^{d,d'}$, I would have no difficulty justifying my 
rejection of Assumption 1. 

2.5.1 Formal criticism of Walton’s marriage argument 

Recall from Section 2.5 that compatibleTillDeath is defined in terms of Compatible and 
some symbols whose descriptions are clear. The three disputable assumptions were: 

Assumption 4. No person p', on any date, can safely predict that two people p and q 
will be compatible from that date until one of their deaths. 

$$\neg\mathrm{CanSafelyPredict}_{p'}^{d}(\mathrm{compatibleTillDeath}_{p,q}^{d})$$

Assumption 5. If p cannot (on date d) safely predict that $\psi$ will be true, then p should 
not (on date d) make a promise that $\psi$ will be true. 

$$\neg\mathrm{CanSafelyPredict}_{p}^{d}(\psi) \Rightarrow \mathrm{ShouldNotDo}_p(\mathrm{makePromise}^{d}(\psi))$$

Assumption 6. Two people who are incompatible during a period cannot live together 
during that period. 

$$\neg\mathrm{Compatible}_{p,q}^{d,d'} \Rightarrow \neg\mathrm{LiveWith}_{p,q}^{d,d'}$$

25 If you change .8 to a value much closer to 1, then Assumption 2 becomes easy to dispute. 

The initial semantic description (language interpretation guide entry) for 
CanSafelyPredict (which is empty, so the only hint we had for interpreting the symbol 
was the name of the symbol) is too vague for me to evaluate Assumption 5. Specifically, 
how much confidence must p have in $\psi$’s truth in order to “safely predict” $\psi$? We can 
say the same for Assumption 4 with either symbol Compatible or CanSafelyPredict, or 
for Assumption 6 with symbol Compatible, although we should prefer Assumptions 5 or 6 
since each depends on only one too-vague symbol. 26 But supposing I choose Assumption 
5, then by the definition of criticizing an interpreted formal proof (Section 2.2) there 
seems to be only one productive thing to do, which is to make the semantics criticism 
(Assumption 5, CanSafelyPredict). I will then communicate with the author directly, 
suggesting they change the semantic description of CanSafelyPredict to something like “If 
p can safely predict $\psi$ on a given date d (i.e. $\mathrm{CanSafelyPredict}_{p}^{d}(\psi)$), then on that date p 
has credence at least X that $\psi$ is or will be true,” where X is some fixed constant. 

Of course the author may reject that suggestion, and instead, for example, add prose to 
the semantic description of CanSafelyPredict that, being still too vague, does not actually 
help me interpret CanSafelyPredict well enough to evaluate Assumption 5. In that case, 
I will introduce some new symbols, which are under my control, along with an axiom 
A, also under my control, that uses the new and old symbols and expresses the above 
Bayesian interpretation of CanSafelyPredict. The author can then accept (unlikely, given 
the previous failure using informal communication), weakly reject, or strongly reject A. 
This formalizes our disagreement about the meaning of CanSafelyPredict, and documents 
it for later readers of the argument. Suppose the author accepts my suggestion, say for 
X = .9. Then I can accept Assumption 5. 

The author’s semantic description of Compatible is still too vague for me to evaluate 
the other two axioms. Once again, according to the definition of criticizing an interpreted 
formal proof, it seems the only productive thing for me to do is make the semantics 
criticism (Assumption 6, Compatible) or (Assumption 4, Compatible). 

As concluded in Section 2.5, if the author clarifies the semantics of Compatible and it 
is very weak (its extension is large), then I can make a subjective criticism of Assumption 
4 and strongly reject it 27 , and if the author clarifies the semantics of Compatible and it 
is (at least) moderately-strong, then I can make a subjective criticism of Assumption 6 
and strongly reject it. Finally, if the author makes his intended semantics for Compatible 
somewhere between “very weak” and “moderately strong”, then I can reject both of 
Assumptions 4 and 6. Any of those three scenarios would be a good place to end the 
dialogue. 

26 This can be useful if informal communication fails, since one can more-easily use the author’s 
acceptance of the axiom to deduce constraints on the meaning of the too-vague symbol. In particular, 
one can sometimes, for the sake of making a criticism, simplify an axiom by partially evaluating it using 
the parts of the author’s language interpretation guide that are sufficiently precise. 

More likely (in this scenario with such an uncooperative author), the author would 
see the vulnerability, and avoid clarifying the semantics of Compatible enough that I 
can make a semantics criticism. In that case, I would formalize the idea of the previous 
paragraph in the following way. I introduce predicates 

GetAlongOk : $P \times P \times D \times D \to \mathbb{B}$ 
Murderous : $P \times P \times D \times D \to \mathbb{B}$ 

My language interpretation guide entry for $\mathrm{GetAlongOk}_{p,q}^{d_1,d_2}$ says that p and q get along 
OK during the period $[d_1, d_2]$, and the entry for $\mathrm{Murderous}_{p,q}^{d_1,d_2}$ says that p and q will try 
to kill each other if they come into contact during $[d_1, d_2]$. Additionally, I introduce two 
defined 0-ary predicate symbols in order to give names to two sentences: 

$$\mathrm{CompatVeryWeak} := \forall p, q, d_1, d_2.\ \neg\mathrm{Murderous}_{p,q}^{d_1,d_2} \Rightarrow \mathrm{Compatible}_{p,q}^{d_1,d_2}$$
$$\mathrm{CompatModeratelyStrong} := \forall p, q, d_1, d_2.\ \mathrm{Compatible}_{p,q}^{d_1,d_2} \Rightarrow \mathrm{GetAlongOk}_{p,q}^{d_1,d_2}$$

Finally, I introduce the following axioms (which I accept), which formally describe 
my above stated positions on the author’s two remaining controversial assumptions for 
a range of possible precisifications of Compatible (since the author has not made his 
intended semantics precise). These axioms imply that I reject at least one of those two 
assumptions: 

$$\neg\mathrm{CompatVeryWeak} \vee \neg\mathrm{CompatModeratelyStrong}$$
$$\mathrm{CompatVeryWeak} \Rightarrow (\text{Assump 6} \wedge \neg(\text{Assump 4}))$$
$$\mathrm{CompatModeratelyStrong} \Rightarrow (\text{Assump 4} \wedge \neg(\text{Assump 6}))$$
$$(\neg\mathrm{CompatModeratelyStrong} \wedge \neg\mathrm{CompatVeryWeak}) \Rightarrow (\neg(\text{Assump 4}) \wedge \neg(\text{Assump 6}))$$
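The claim that these four axioms commit me to rejecting at least one of the two assumptions can be checked with a short case split. The abbreviations $V$, $M$, $A_4$ and $A_6$ are introduced here only for this illustration.

```latex
\begin{align*}
&\text{Let } V = \mathrm{CompatVeryWeak},\ M = \mathrm{CompatModeratelyStrong},\
  A_4 = \text{Assump 4},\ A_6 = \text{Assump 6}. \\
&\text{Case } V: && V \Rightarrow (A_6 \wedge \neg A_4), \text{ so } \neg A_4. \\
&\text{Case } \neg V \wedge M: && M \Rightarrow (A_4 \wedge \neg A_6), \text{ so } \neg A_6. \\
&\text{Case } \neg V \wedge \neg M: && (\neg M \wedge \neg V) \Rightarrow (\neg A_4 \wedge \neg A_6),
  \text{ so } \neg A_4 \text{ and } \neg A_6. \\
&\text{In every case:} && \neg A_4 \vee \neg A_6, \text{ i.e. } \neg (A_4 \wedge A_6).
\end{align*}
```

The first axiom, $\neg V \vee \neg M$, is not needed for this conclusion. Its role is to exclude the case $V \wedge M$, in which the second and third axioms would contradict each other.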


27 Meaning all of my personal interpretations of the language falsify the axiom. Note that this only requires 
that all of my intended interpretations have at least one tuple (p', p, q, d) for which the formula is false. 

2.5.2 Literal, ungenerous interpretation of (non-simplifying) 
assumptions 

There is another problematic axiom in the argument, Assumption 6, that is easily fixable 
and which, according to the directions for criticizing an interpreted formal proof, should 
be criticized even if the critic knows it is fixable. One simple acceptable way for the 
author to respond is by changing the label of the axiom from Assumption to Simplifying 
Assumption. 

Instances of Assumption 6 for which d and d' are close should be weakly rejected. 
Even for a weak definition of compatible (but not quite as weak as CompatVeryWeak), 
there would exist two people who are to that extent strongly incompatible and yet manage 
to live together for a few days. The author should address the criticism, and here are two 
quick ways of doing so according to the rules: 

1. Make Assumption 6 a Simplifying Assumption, and amend the natural language 
text associated with it to describe the sense in which it is a simplifying assumption. 

2. Add a hypothesis such as $d + 365 < d'$ to Assumption 6 (introducing sort $\mathbb{N}$ and 
symbols $+ : D \times \mathbb{N} \to D$ and $365 : \mathbb{N}$ 28 ), yielding: 

$$d + 365 < d' \wedge \neg\mathrm{Compatible}_{p,q}^{d,d'} \Rightarrow \neg\mathrm{LiveWith}_{p,q}^{d,d'}$$


With option 2, we can also use the new symbols to formalize an assumption that has the 
same purpose as the informal constraint on the sort P for people that says P includes 
only people who are not near death (given in its semantic description in Section 2.5), as 
follows: 


Simplifying Assumption 1. When two people get married, they both live for at least 
a year after. 

$$\mathrm{Does}_p(\mathrm{getMarriedTo}_q^{d}) \Rightarrow d + 365 \leq \mathrm{firstDeath}_{p,q}$$


Now, what was the point of this pedantry? Essentially, it is the application of a safety 
principle. The method of formal deduction and criticism with interpreted formal proofs 
that I advocate may not be robust unless a critic can insist on having technical problems 
fixed without having to justify why it is important to do so. Otherwise, disputes about 
meaningful matters will sometimes devolve into unending arguments about argumentation 
itself. In Leibniz’s words: 

...Then there will be an end to that burdensome raising of objections by which 
one person now usually plagues another and which turns so many away from 
the desire to reason. When one person argues, namely, his opponent, instead 
of examining his argument, answers generally, thus, “How do you know that 
your reason is any truer than mine? What criterion of truth have you?” 

(Gottfried Leibniz, 1679, “On the General Characteristic”[LL76]) 

28 365 is larger than necessary, but it is not important. 


Chapter 3 

Classical deductive formalization of 
defeasible reasoning 


The purpose of this chapter is twofold. First, to share the high-level ideas of some reusable 
formalization patterns that I have used in the course of writing examples. Second, as an 
extension of Section 2.3 that addresses, by example, objections against the foundations 
of this project along the lines of deduction being inappropriate for real-world reasoning. 
The following quote from [Gor88] is an example of such an objection. I have inserted 
numbers (n) for the purpose of commenting following the quote. 

Standard propositional and predicate logics are monotonic. That is, if a 
proposition is logically implied by some set of propositions, then it is also 
implied by every superset of the initial set. (0) Another way of describing 
monotonicity is to say that once something is determined to be true, it remains 
true. (1) No additional information can cause conclusions to be modified or 
withdrawn. (2) There is no way to presume something to be the case until 
there is information to the contrary. (3) There are no rules of thumb, or 
general rules, which allow conclusions to be drawn which may be faulty, but 
are nonetheless better than indecision. (4) Classical logic offers no theory 
about when to prefer one belief to another in general, and provides no language 
for stating which beliefs to prefer given that certain things are known in a 
particular case. 

(5) The subject matter of classical logic is truth, not decision making. The 
central concern of logic is logical consequence: which propositions are necessarily 
true given that other propositions are true. (6) Monotonic logic is 
very useful when we want to know what must be the case if something else is 
known to be true. It is less useful when we know very little about some domain 
with certainty, or can discover the facts only by extending resources, if at all. 

(7) Monotonic logic alone provides us with an infinite number of conditional 
statements of the form "this would be true if that is true", which is of little 
help in making decisions when we are unable to establish with certainty the 
truth or falsity of the alternative premises. [Gor88] 

Already at (0) we have a classic indicator of problems to come: reference to unqualified, 
non-relative truth is often not meaningful in formal logic, and definitely is not meaningful 
for classical FOL, which is defined in terms of truth-with-respect-to-structures. This 
ambiguous use of “truth” leads to all sorts of confusion and equivocation, and should be 
banished whenever one is debating the merits of one logic over another. 

Points (1)-(4) are a straw man argument. The author implicitly conflates the use of 
classical logic via first-order theories with the definition of classical FOL; the former is 
the author’s desired target, and the latter is the straw man. The fact is that none of 
these complaints about the literal, technical definition of classical predicate logic apply to 
first-order theories, which are what the author should really be attacking. 

Points (5)-(7) are an innocent, understandable, and common oversimplification that I 
believe leads to an unfortunate misconception about the scope of classical predicate logic, 
or even formal logic in general, especially among people unfamiliar with formal logic. (5): 
It would be more accurate to say that the subject matter of classical predicate logic is 
semantics, with relative-truth being an important special case. Perhaps this is easier seen 
in the formulations of MSFOL that treat the set of truth values as just another sort, so 
that the boolean connectives are just very common function symbols. (6): It is true that 
when there is a great deal of uncertainty in a domain, those boolean function symbols 
are used a little less, with function symbols for Bayesian reasoning having a larger role, 
but that is hardly a criticism of classical logic. (7): In fact conditionals remain just as 
essential in domains with a lot of uncertainty. They are our main tool for excluding from 
consideration the structures that we are not interested in reasoning about, and they are 
just as useful when those structures contain e.g. Bayesian distributions, interpretations of 
defeasible legal statutes, etc. 




3.1 Argument from expert opinion 

3.2 Bayesian reasoning 

This section is about how to criticize arguments that use Bayesian reasoning. I use the 
phrase “subjective probability assumptions” to refer to the informal class of assumptions 
that includes priors, bounds on conditional probabilities, and choices of parametric models. 

Bayesian reasoning and statistics feature prominently in several of my major examples. 
The literature on the problem of interpreting Bayesian subjective probability assumptions, 
and the (sometimes insubstantial) Bayesian vs Frequentist debate, is vast (see [F+11] and 
[Efr05] for refreshingly concrete and pragmatic perspectives). Without surveying all the 
motivations and interest in the interpretation problem, let us make more precise why it is 
a problem for this project that cannot be easily dismissed. 

There are subjective assumptions that seem normal and obviously necessary, such as 
some of the assumptions I make in my arguments about assisted suicide, and generally 
the kind of assumptions one must make in order to derive anything with nontrivial ethical 
ramifications. And then there are subjective probabilities, which make most of us at least 
a little uneasy. If you say that your prior for a murder suspect’s guilt is x, what recourse 
do I have if I think that prior is unreasonable? And how do I make precise what it is that 
I am disagreeing with? Interpretations involving betting ratios/Dutch books[Tal13] help 
with the latter problem, but not obviously with the former. 

I claim that the origin of this uneasiness is the same as that caused by using explicit 
real-valued utility functions to reason about people’s subjective values. I handle those 
cases in the same way. The issue in both cases is with the mixture of qualitative and 
quantitative subjective assumptions. 

To be more concrete, let’s look at an example of a typical Bayesian probability 
assumption, from a famous case in England in which a criminal defense team was allowed 
to have a statistics expert present a Bayesian analysis to a jury - one of only a few times 
this has ever been allowed in a jury case[Don05][Kad08]. Most of the details of the case 
are not important for our purposes; a woman was assaulted by a man, the assailant left 
DNA evidence, and years later a man was prosecuted, and ultimately convicted, after a 
“cold hit” 1 on London’s DNA database made him the suspect. Please note that we won’t 
be focusing on the DNA aspect of the case, which is the most interesting and contentious 
part (see [SPMS09], [DF99]; later on I will release an interpreted formal proof about 
this case). The only reason I mention the DNA aspect is for point 1 on page 31. The 


1 Meaning the man’s DNA was run through the DNA database before he was a suspect. 
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probability assumption we are focusing on is the instantiation (or bounding, as I would 
prefer) of two of the defense team’s fundamental parameters involved in estimating the 
likelihood ratio: 


\[ \frac{\Pr(\text{evidence} \mid \text{suspect guilty})}{\Pr(\text{evidence} \mid \text{suspect innocent})} \]

(“innocent” in the sense of footnote 2) 


Namely these two: 

$c_1 = \Pr(\text{victim failed to identify suspect in police lineup} \mid \text{suspect guilty})$ 

$c_2 = \Pr(\text{victim failed to identify suspect in police lineup} \mid \text{suspect innocent})$ 

Now suppose I make assumptions that constrain the parameters of the argument, including 
those two, enough to imply that the likelihood ratio is large, suggesting -contrary to 
my opponent’s intuition, let’s assume- that the suspect is guilty. One contributor to 
that numeric result is that I make an assumption that implies $c_2/c_1$ is upper-bounded 
by a particular number close to 1, so that the failure to identify the suspect cannot 
contribute much to making the likelihood ratio small. My opponent will want to reject 
that assumption, believing that the ratio should be much larger than 1. 
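
The arithmetic behind that bound can be sketched as follows. This is only an illustration under an assumption the text does not state: that the likelihood ratio factors as the lineup term c_1/c_2 times a term for all the other evidence. The function names and all numbers are hypothetical and are not figures from the case.

```python
def lineup_factor_lower_bound(ratio_upper_bound: float) -> float:
    """Given an assumed bound c2/c1 <= ratio_upper_bound, return the implied
    lower bound 1/ratio_upper_bound on the lineup factor c1/c2."""
    if ratio_upper_bound <= 0:
        raise ValueError("the bound on c2/c1 must be positive")
    return 1.0 / ratio_upper_bound


def likelihood_ratio_lower_bound(other_evidence_lr: float,
                                 ratio_upper_bound: float) -> float:
    """Lower bound on the full likelihood ratio, ASSUMING (hypothetically)
    that it factors as (c1/c2) * other_evidence_lr."""
    if ratio_upper_bound <= 0:
        raise ValueError("the bound on c2/c1 must be positive")
    return other_evidence_lr / ratio_upper_bound


# Hypothetical numbers, not taken from the case.
mine = likelihood_ratio_lower_bound(1000.0, 1.25)      # I assume c2/c1 <= 1.25
opponent = likelihood_ratio_lower_bound(1000.0, 50.0)  # opponent allows c2/c1 up to 50
print(mine, opponent)
```

With a bound close to 1, the failed identification can shrink the ratio only slightly; with the opponent's much larger bound it could shrink it fifty-fold, which is why the disagreement about this one ratio matters.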

Moreover, suppose my opponent believes that my assumption about $c_2/c_1$ is unreasonable. 
Since these are probabilities representing my credence about whether certain 
unrepeatable events happened in the past, it is hard for my opponent to argue that I am 
being unreasonable in an objective sense (this is the crux of the Bayesian interpretation 
problem). Some Bayesians will answer that my assumption is not unreasonable provided 
it is consistent with my other probability assumptions and standard probability axioms. 
Ultimately I agree with that position, and I offer no magic solution that would enable us 
to resolve situations of fundamental uncertainty. However, in the context of this project, 
there is something more I can do to at least state my position in a clear way. 

Before demonstrating my recommendation, I’ll briefly remind the reader how the 
clarification of a position is usually done in Bayesian reasoning. One introduces additional 
random variables representing evidence or environmental factors, with their own (usually 
brief) informal interpretations. Then one introduces new independence and conditional 
probability assumptions to derive a bound on the new likelihood ratio (new because now 
the meaning of evidence has changed). This is indeed the right thing to do when one 
can precisely describe the domains of the new random variables, and when one has some 
reason to be confident in the new probability assumptions. But those conditions are often 
hard to meet, and when they are not met, the additions may introduce more noise and 


2 Here “innocent” means factually innocent, as opposed to legally not guilty. 
obfuscation into the argument than they do clarification. Consider, for example, trying to 
formalize a model of the victim forgetting the look of the assailant over time (over a year 
passed between the crime and the lineup), which is complicated by the dependence on 
both the look of the (unknown!) assailant and the victim’s general ability to recall faces. 

My recommendation involves adding much more detail to the informal semantics, only 
adding detail to the formal mathematical model once agreement has been established 
for the informal semantics. Suppose I formulate my assumption about the ratio $c_2/c_1$ as 
follows. I make two assumptions: 

A1: We can model all our knowledge that is significantly relevant to $c_2/c_1$ with the 
following thought experiment, which consists of two scenarios. First, I specify large, 
precise sets of white men $M$ and women $W$. In scenario 1, random members $m$ of 
$M$ and $w$ of $W$ appear in the exact locations of the assailant and victim in the real 
crime, under similar lighting and weather conditions. Time resumes and a crime 
may or may not happen, depending on $m$ and $w$. Any instances of the thought 
experiment in which events do not transpire in a way sufficiently similar to how 
they did with the actual crime are ignored. Then $w$ is shown a police lineup containing $m$ 
(after a time delay equal to the one in reality), where the other men in the lineup 
are selected from $M \setminus \{m\}$ in a fixed manner (which I would explain in detail) that is 
similar to how they were selected in reality (roughly based on looking similar to $m$). 
Then $c_1$ is the probability that $w$ fails to identify $m$. Scenario 2 is the same except 
that two distinct random members $m, m'$ of $M$ are sampled, with $m'$ representing 
an innocent suspect, and $w$ is shown a lineup containing $m'$ instead of $m$. Then $c_2$ 
is the probability that $w$ fails to identify $m'$. 

A2: The second assumption says that $c_2/c_1$ is upper-bounded by a particular number 
close to 1. 

We cannot perform such an experiment, since doing so would be unethical and impractical. 
Although that verbose pair of assumptions does not get us significantly closer to an 
objective test, it does get us significantly closer to an objective description. The motivation 
is to be more precise, and a few observations about the thought experiment will convey 
that: 

1. The innocent suspect and the assailant are sampled almost-independently of each 
other (except for the constraint that they are distinct). This is actually a significant 
modeling assumption, because even if we assume the suspect Adams is innocent, 
we know his DNA test profile is the same as the assailant’s - and it is plausible that 
such genetic similarity makes two men significantly more likely to look significantly 
similar to each other. 3 

2. Assuming W and M are large and varied, I do not use any of the (mostly noisy) 
knowledge that we have of the victim and suspect. This amounts to an implicit 
assumption that such knowledge is irrelevant. 

3. I have fixed a method of constructing police lineups. In my opponent’s favour, I 
have made no assumption that the lineup was flawed in a way that would make the 
victim less likely to select a guilty suspect and/or more likely to select an innocent 
suspect. 

4. We cannot test my second assumption A2, but A2 does have the virtue of being 
a relatively-objective assumption (relative to how precisely I specify the thought 
experiment). I claim this is valuable. 

If the thought experiment is actually carried out, then this reasoning strategy is essentially 
Empirical Bayes[Efr05] 4. Note the structural/qualitative character of assumption 
A1. It amounts to my modeling assumptions. In contrast, A2 has a numeric character. 
This fits my general recommendation: split a subjective probability assumption into two 
parts: (A1) a subjective, qualitative part, and (A2) an objective, quantitative part. Doing 
that separation in a way that makes A1 acceptable to both sides of an argument often 
requires the description of elaborate (but sufficiently precise) thought experiments that 
will never be carried out, so I should reiterate that this is not a strategy for directly 
resolving disputes. 

So what is gained? We can now move forward with our disagreement on $c_2/c_1$, either 
by coming to agreement on an A1-type assumption, or by failing to do so. If we fail to 
agree on A1, then the quantitative part of the probability assumption was likely obscuring 
the fundamental source of disagreement more than anything it was contributing. If we 
agree on A1, then we are left with two (inconsistent) versions of A2-type assumptions, 
which are as uncertain as the original probability assumption, but are more objective and 
precise, and thus easier for other people to judge for themselves. 

3.3 Theory comparison 

The comparison of theories (or models, explanations of evidence) is a broad category of 
defeasible reasoning in the physical and social sciences, as well as in criminal law. The 

3 We already know the genetic similarity makes them at least a little more likely to look similar to 
each other because, for example, siblings are much more likely to have identical DNA test profiles, and 
two men with identical DNA profiles have the same race[LCMJ10]. 

4 See reference. In short, this is the approach of using loosely-related data to set or constrain priors 
and other parameters, under the assumption that the data is not chosen in a biased way. 
Leighton Hay argument (Chapter 6) and the smoking-causes-cancer argument (Chapter 
7) are in this category. Both have the following form: 

(a) Specification of one’s desired consequence of a comparison being sufficiently-favourable 
to one’s preferred model. For example, that some scientific theory should be abandoned, 
that some public policy should be put into place, or that some person accused 
of a crime should be declared guilty. 

(b) Formal definition of when one theory is better than another (or “much better” as in 
the smoking-causes-cancer argument; the strength needed depends on how much 
force part (a) requires of the comparison), together with a proposition that part (d) 
suffices to justify part (a). 

(c) Formalization of the two or more competing theories. 

(d) Deductive proof that one theory is better (or “much better”) than the others, 
according to the definition. 

(c) is defeasible in that the formalization of a theory might be unfaithful to the intent 
of the proponents of the theory, a kind of straw man argument. (b) may be unfair 
(bias), untrustworthy (variance/bias), or otherwise inappropriate. One type of potential 
flaw, which is warned about especially often in discussion of the limitations of Bayesian 
reasoning, is that (i) the definition is a reasonable one based on how well the theories 
explain or predict the available evidence, but not all evidence relevant to the decision 
(a) is included. In contrast, the concern that has received the most technical attention 
in statistics (both the Bayesian and frequentist schools) is whether (ii) the definition is 
reasonable for the available evidence. Finally, even when all the significantly-relevant 
evidence is considered and the comparison relation is reasonable, it may be that (iii) the 
comparison is too weak to justify (a). A critic could try to show any of those three types 
of flaws in the smoking-causes-cancer or Leighton Hay examples, though I believe types 
(i) and (iii) would be the most fruitful for them. See Sections 7.1 and 6.3 for some specific 
criticisms of the smoking-causes-cancer and Leighton Hay arguments, respectively. 

3.4 Costs/benefits analysis 

There is a common belief in the social sciences that mathematical/logical methods 
necessarily oversimplify social issues, and because of this they are inappropriate for 
reasoning about such issues. The first part is true, although it is equally true of natural 
language argumentation. My main concern here is to argue that the second part is not 
supported by the first, provided results are reported in a disciplined way. 
The Sue Rodriguez argument (Chapter 4) and the physician-assisted suicide argument 
(Chapter 8) can both be construed as costs/benefits analyses. In both examples, only a 
subset of the a priori relevant factors/concerns are considered. 

In the Sue Rodriguez argument this is very explicit, since all the goal sentence says 
is that a certain set of 4 concerns do not justify ruling against her. There remains the 
possibility that I excluded some concern that is highly relevant, in fact so relevant that 
with its inclusion there is a strong argument to be made for the negation of the goal 
sentence modified to include the 5th concern. It is tempting to consider my original 
4-concern argument invalidated by the hypothetical new 5-concern argument; however, 
that is not technically correct if one treats the arguments in the same way as mathematics 
arguments; it is not necessarily inconsistent/unreasonable to accept both arguments. Here 
is an analogy from math: We are interested in whether a majority of elements of a finite 
set $S$ have a property $P$. On the way, we prove the answer is no if we replace $S$ with a 
certain subset $S_1 \subset S$, then later we prove the answer is yes if we replace $S$ with a certain 
$S_2$ such that $S_1 \subset S_2 \subset S$, and then no again for a certain $S_3$ such that $S_1 \subset S_2 \subset S_3 \subset S$. 
There is no inconsistency, and moreover we are slowly getting closer to the truth about S. 
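
The analogy can be made concrete with a small sketch. The sets and the property below are invented purely for illustration; they only show that the answer to the majority question can alternate along a chain of nested subsets without any two of the results contradicting each other.

```python
def majority_has_property(subset, has_property):
    """True if strictly more than half of the elements satisfy the property."""
    members = list(subset)
    return 2 * sum(1 for x in members if has_property(x)) > len(members)


# Hypothetical property: it holds exactly on these elements.
HOLDS = {1, 4, 5, 6, 7} | set(range(13, 20))


def has_property(x):
    return x in HOLDS


# A chain of nested sets: S1 is a proper subset of S2, S2 of S3, S3 of S.
S1 = set(range(1, 4))
S2 = set(range(1, 8))
S3 = set(range(1, 13))
S = set(range(1, 21))

answers = [majority_has_property(X, has_property) for X in (S1, S2, S3, S)]
print(answers)
```

Each answer is a true statement about its own set, so accepting all of them is consistent, just as accepting both a 4-concern and a 5-concern argument can be.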

The assisted suicide argument is structured a little differently at its top level, in that 
its goal sentence is not relative to the simplification (the simplification being that only a 
subset of the a priori relevant factors are considered). The argument’s Assumption 1 that 
Main Lemma 5 ⇒ <should pass> 6 implicitly includes the simplifying assumption that it 
is only necessary to consider the direct effects of the assisted suicide system on individual 
people, and not, for example, on the culture of Canadian society or on groups of people. 
If a critic goes on to argue the negation of the goal sentence, by including a strictly larger 
set of factors, but without assuming anything in conflict with the assumptions of my 
argument about individual people, then they would indeed need to reject Assumption 1 
in order to be consistent/reasonable. 

It would be very easy to modify the assisted suicide argument so that its goal sentence 
has the same relative form as the Sue Rodriguez goal sentence 7 , and vice-versa. Moreover, 
in any case, the goal sentence does not even give the meaning of the proof, which is 
properly given by $\bigwedge_{A \in \text{Axioms}} A \Rightarrow \text{goal sentence}$. Nonetheless, I think the difference in 
the two forms of goal sentences is important, if only because of the tendency for proofs to 
be reported in terms of their main conclusion, with the axioms left tacit. That practice 
needs to be actively discouraged to answer the oversimplification concern. Fortunately, 

5 You do not need to look up what this is for the purpose of this discussion. 

6 Which is a 0-ary predicate symbol that means legislation should be passed that introduces an assisted 
suicide system which is consistent with the constraints given in the argument. 

7 Basically just delete Assumption 1 and make the Main Lemma be the goal sentence 
all that requires is familiarity with FOL and reading the proof document itself. The same 
cannot be said for defeasible logics, or logics that offer very simple syntax to express 
complex semantics (see Section 2.3). 

3.5 Counterfactual reasoning, hypothetical scenarios 

There is a great deal of literature giving general mathematical theories/systems (sometimes 
called “logics”) for modeling counterfactual reasoning, i.e. reasoning about what would 
have happened had something been different. Despite the abstract and sometimes-unreal 
nature of such hypothetical scenarios, we can still reason together deductively, as we are 
often able to describe such scenarios in such a way that our individual understandings 
are similar enough that the differences have no significant effect on the argument. 

The two arguments about the Berkeley gender bias case (Chapter 5) are examples of 
deductively-formalized counterfactual reasoning. They have (or can easily be put into) 
the form: 

(i) Formalize constraints that are (supposedly) sound with respect to an informal 
theory/explanation/model that one wishes to attack. 

(ii) Formalize constraints that one believes should be required to hold regardless of the 
theory. 

(iii) Deductively show that the constraints together are inconsistent. 

It is defeasible in that one’s type (i) constraints may misrepresent the informal theory, and 
in that one’s type (ii) constraints may be rejected by opponents. For example, both of the 
Berkeley arguments make the simplifying (and technically probably wrong) assumption 
that in the particular pools of applicants to each department, the males and females 
are equally qualified. That is a type (ii) constraint. The remainder of this section will 
examine those two arguments in more detail in relation to the itemized form above. 

In the first argument, I require a definition of gender prejudice that is weaker than 
financial forces, in the sense that if all means of discrimination had been removed, then 
there would not have been a large change in the total number of applicants accepted to 
each department. That is a type (ii) constraint. There is a type (i) constraint that the 
observed gender biases were caused by prejudice, so that when all means of discrimination 
are removed, each department should accept men and women at approximately the same 
rate. Finally, there is the similar type (i) constraint that if all means of discrimination 
had been removed, it should be possible (under the other constraints) that the overall 8 


8 Meaning across all departments. 
acceptance rate for women relative to men improves. Part (iii) consists of proving the 
negation of that constraint, i.e. that the overall acceptance rate for women gets worse for 
any admissions round that satisfies the other constraints. 

The second argument can alternatively be construed as a theory comparison argument, 
but I’ll describe it in the above form. The main type (i) constraint is effectively that the 
test used to infer gender discrimination -whether there is a significant bias in favor of 
men in the overall acceptance rate- should not be sensitive to whether or not the genders 
apply to different departments at different rates. The main type (ii) assumption is that 
we can assess that sensitivity by considering an arbitrary pool of applicants of the same 
size in which men and women apply to each department at close to the same rate, and an 
arbitrary admissions round in which the gender-specific acceptance rates are close to the 
same as what they actually were (so if there was gender discrimination in reality, then 
there should still be gender discrimination in the hypothetical case). In more detail, the 
main type (i) constraint says that the numeric constraints on the hypothetical admission 
round should not be enough to guarantee that the test’s answer changes from “gender 
discrimination” to “no gender discrimination”; part (iii) consists of proving the negation 
of that constraint. 
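
The sensitivity at issue in the second argument can be illustrated with a toy calculation. Everything here is an assumption made for illustration: two departments, invented applicant counts (not the Berkeley data), identical acceptance rates for men and women within each department, and a hypothetical `margin` deciding when the overall test reports discrimination. It is a sketch of the phenomenon, not a reproduction of the formal argument in Chapter 5.

```python
from fractions import Fraction


def overall_rate(applicants, accept_rate):
    """Overall acceptance rate for one gender across all departments."""
    accepted = sum(applicants[dept] * accept_rate[dept] for dept in applicants)
    return accepted / sum(applicants.values())


def overall_test_flags_bias(men_rate, women_rate, margin=Fraction(1, 10)):
    """Hypothetical test: report discrimination when the overall rate for men
    exceeds the overall rate for women by more than `margin`."""
    return men_rate - women_rate > margin


# Invented per-department acceptance rates, the same for both genders.
accept_rate = {"A": Fraction(3, 5), "B": Fraction(1, 10)}

# Pool 1: the genders apply to the departments at very different rates.
skewed_men, skewed_women = {"A": 800, "B": 200}, {"A": 200, "B": 800}
# Pool 2: same pool size, but both genders apply at the same rates.
even_men, even_women = {"A": 500, "B": 500}, {"A": 500, "B": 500}

skewed_verdict = overall_test_flags_bias(overall_rate(skewed_men, accept_rate),
                                         overall_rate(skewed_women, accept_rate))
even_verdict = overall_test_flags_bias(overall_rate(even_men, accept_rate),
                                       overall_rate(even_women, accept_rate))
print(skewed_verdict, even_verdict)
```

The department-level acceptance rates are identical in both pools, yet the overall test changes its answer, so in this toy setting the test is sensitive to where the genders apply.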


3.6 Multiplicity of reasons 

The most defeasible of defeasible argument types has this form: some $n$ reasons $R_1, \dots, R_n$ 
are given for a proposition G. It plays a central role in the Carneades system (see Section 
9.1.1), for example. Except for the qualitatively-different n = 1 case 9 it does not appear in 
any of the examples in this thesis, and nor should it; the opinion I advocate in this thesis 
is that, for the problems in the intended problem domain (Section 2.1), the argument 
form is inappropriate for anything but fast and speculative reasoning. 

It is tempting to try to deductively formalize such an argument as 

Simplifying Assumption: $R_1 \wedge \dots \wedge R_n \Rightarrow G$ 10 
Axiom 1: $R_1$ 
$\vdots$ 
Axiom n: $R_n$ 

9 i.e. the general use of conditionals, discussed on page 3. 

10 Recall from the beginning of this chapter that implication is interpreted classically, so that to accept 
this simplifying assumption (Section 2.2) means nothing more or less than that you are willing to exclude 
from your set of personal L-interpretations any interpretations that satisfy all the $R_i$ but falsify $G$. 
However, that formalization is not faithful to the intended meaning of the defeasible 
argument. It is too fragile. In the framework of deduction, an effective criticism against 
any one of those n + 1 axioms is as good as an effective criticism against them all. In the 
framework of defeasible reasoning, in contrast, a criticism that effectively argues against 
only one of the $R_i$ is regarded as only weakening the criticized argument. For fast and 
speculative reasoning, that is a good thing. But when there is adequate time available, 
the problem is serious, and it is hard to make progress, that principle of defeasible 
argumentation puts too great a burden on the critic. The burden should be on the 
argument’s author to formalize the sense in which the acceptability of each $R_i$ contributes 
to the acceptability of $G$. 11 
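
The fragility of the deductive reading can be checked mechanically for a small case. The sketch below takes n = 3 and treats the reasons and the goal as propositional atoms, which is an assumption made only for illustration; the helper `entails` is a brute-force truth-table check, not part of any system described in this thesis.

```python
from itertools import product

ATOMS = ("R1", "R2", "R3", "G")


def entails(axioms, goal):
    """Classical entailment by truth table: the goal holds in every
    valuation of ATOMS that satisfies all of the axioms."""
    axioms = list(axioms)
    for values in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if all(axiom(v) for axiom in axioms) and not goal(v):
            return False
    return True


axioms = {
    "simplifying assumption": lambda v: (not (v["R1"] and v["R2"] and v["R3"])) or v["G"],
    "axiom 1": lambda v: v["R1"],
    "axiom 2": lambda v: v["R2"],
    "axiom 3": lambda v: v["R3"],
}


def goal(v):
    return v["G"]


# With all n + 1 axioms the goal follows.
with_all = entails(axioms.values(), goal)

# Removing any single axiom (an effective criticism of it) loses the goal.
without_one = {
    name: entails([ax for other, ax in axioms.items() if other != name], goal)
    for name in axioms
}
print(with_all, without_one)
```

The goal is entailed by the full set and by none of the sets with one axiom removed, which is the sense in which a successful criticism of one axiom is as good as a criticism of them all.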

Nonetheless, the pattern from the previous paragraph can be useful when the axioms 
are properly interpreted deductively. For example, suppose that the Crown prosecutors of 
the Canadian government publish a high-level interpreted formal proof that some person 
is guilty of a murder. A good such argument will employ Bayesian reasoning in some 
places, but at the top level it could have the above propositional structure. Let’s say the 
goal sentence is a 0-ary predicate symbol G whose language interpretation guide says 
something along the lines of “the suspect is guilty”. The prosecution has evidence linking 
the murder weapon to the suspect, eye-witness evidence identifying the suspect at the 
scene of the crime, and DNA evidence of the suspect’s blood collected at the scene of 
the crime. We’ll make $R_1, R_2, R_3$ be 0-ary predicate symbols. The prosecution’s language 
interpretation guide entries for them are: 

• $R_1$: The murder weapon belonged to the suspect. 

• $R_2$: The eye-witness correctly saw the suspect fleeing the scene of the crime. 

• $R_3$: Blood belonging to the suspect was found 10 feet from the victim. 

Each $R_i$ is a lemma proved from other assumptions, and they are connected to the goal 
sentence by the simplifying assumption $R_1 \wedge R_2 \wedge R_3 \Rightarrow G$. This is interesting, surprisingly, 
when we consider what it means for the prosecution to put forward such a simplifying 
assumption, and for the defense to accept it. From the prosecution’s perspective, it is 
useful since it simplifies their task to giving arguments for each of the $R_i$ independently, 
but it is also risky since it introduces fragility to their argument - the defense only needs 
to argue against the weakest of the $R_i$. It is worthwhile for the prosecution if they strongly 
believe in $R_1, R_2, R_3$, and have no other strong inculpatory evidence. From the defense’s 
perspective, the simplifying assumption is useful since it allows them to focus on refuting 
the weakest of the $R_i$, but it is also a concession, since it is possible that $R_1, R_2, R_3$ are 

11 Of course, a defeasible logic may provide some sophisticated schemas for formalizing that kind of 
relationship, but those can just as well be made into reusable first-order theories. 
true and $G$ is false 12. It is worthwhile for the defense to accept the simplifying assumption 
if they think they can give a strong argument that at least one of the $R_i$ is false, and they 
have no strong exculpatory evidence that isn’t related to the $R_i$. 


12 For example, it may be that the actual murderer was an associate of the suspect who had access to 
the suspect’s gun, both were at the scene of the crime (but the murderer was not seen by the witness), and 
the suspect was injured by the murderer while trying to defend the victim (hence the suspect’s blood). 



Chapter 4 

Example: Sue Rodriguez’s supreme 
court case 


This argument is meant to be read in a browser, and can be found at: 


http://www.cs.toronto.edu/~wehr/thesis/sue_rodriguez.html 

I include an inferior static version here just in case you have a printed copy and you 
strongly prefer to read on paper. 
Introduction 

This is an argument for granting the right to assisted suicide to a particular 
individual, as opposed to an argument for an assisted suicide policy, as found in 
several countries in Europe and a couple of American states, and which would provide 
access to assisted suicide to any Canadian who meets certain requirements. I will 
adopt a narrative where the party criticizing this proof is the supreme court justices 
who voted to deny Sue Rodriguez's petition [see decision]. Exactly the same argument 
works for the more-recent case of Gloria Taylor, who won her case for access to 
assisted suicide at the British Columbia Supreme Court in 2012 (and lost at the B.C. 
Court of Appeal in 2013). In Sue Rodriguez's particular case, no major party to the 
argument argued that the government would be doing her harm by making assisted 
suicide legal for her (this is Assumption 7). Thus, the argument comes down to 
whether allowing Sue Rodriguez (S.R.) access to assisted suicide would have a 
negative effect of some sort (against other people - see Assumption 5; or abstract 
principles - see Assumption 4 and Assumption 6) that rivals the negative effect of 
denying her access. The main goals of this argument are: 

1. To clarify the qualitative cost to S.R. of denying her access to legal assisted 
suicide. 

2. To more-precisely state the position that (1) exceeds any cost incurred if the 
Supreme Court were to grant her access. Or rather, that no such cost has been 
presented, and because of that she should have been granted access. 

One major difference between this formal argument and informal, natural language 
arguments about assisted suicide cases (those that I've encountered) is the careful 
distinguishing between 

• actions that individuals can do 

• states of affairs that individuals want to achieve, which are achievable by their 
taking certain actions 

• states of affairs that we try to prevent by criminalizing certain actions 

The primitive symbols of the language of this proof only speak directly about 
criminalizing actions (sort Actions). The language speaks indirectly about 
criminalizing states of affairs (sort Propositions) via the defined predicate <Justifies 
criminalizing satisfaction of>; we can say that the law criminalizes a state of affairs if 
it criminalizes every action that can achieve that state of affairs. As an example, in 
many jurisdictions the law indirectly criminalizes any state of affairs ψ in which a 
young person is high on crack cocaine. Suppose I want to justify that law, but 
without taking a moral stance on whether it is fundamentally wrong to use crack 
cocaine. Instead, I'll justify it in terms of the desired state of affairs δ that no young 
person is at risk of becoming addicted to crack cocaine. Then I must do two things: 

1. Argue that the only actions that can achieve ψ would falsify δ. 

2. Argue, or assert, the subjective position that the satisfaction of δ is more 
important than the satisfaction of ψ. 
It is important to make these distinctions for this ethical issue for two reasons: 

First, because it allows two people to disagree on part of the law while agreeing on 
a subjective moral position such as (2). For example, it is conceivable that in the 
future a drug is invented that somehow counteracts the addictive properties of crack 
cocaine. In such a future, (1) is much easier to reject, and if I reject it then I can argue 
that δ does not justify criminalizing ψ while still agreeing with (2). In more detail, 
imagine such an anti-addiction drug is invented, a combination of it and crack cocaine 
is manufactured, and the combination drug has the property that it is more costly to 
separate its two component parts than it is to make crack cocaine from scratch. An 
action α involving the manufacturing and selling of the combination drug is sufficient 
to attain ψ, but arguably does not risk falsifying δ. In the language of this argument: α 
∈ <actions sufficient for>(ψ) ∧ ¬Conflicts(α,δ). 
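
The distinction between criminalizing actions and indirectly criminalizing a state of affairs can be sketched with finite sets. This is a toy model under stated assumptions: the action names are invented, actions are plain strings, and the two helper functions are hypothetical stand-ins rather than the definitions used in the proof itself.

```python
def state_is_criminalized(sufficient_actions, criminalized_actions):
    """A state of affairs is indirectly criminalized when every action that
    can achieve it is criminalized."""
    return set(sufficient_actions) <= set(criminalized_actions)


def claim_1_holds(sufficient_actions, conflicting_actions):
    """Claim (1): the only actions that can achieve the state of affairs are
    ones that conflict with the concern."""
    return all(action in conflicting_actions for action in sufficient_actions)


# Invented actions. Today only one kind of action gets a young person high.
sufficient_today = {"sell crack cocaine"}
# In the imagined future the combination drug is a second way to do so.
sufficient_future = sufficient_today | {"sell combination drug"}

# Actions assumed to risk addiction, i.e. to conflict with the concern.
conflicting = {"sell crack cocaine"}

today = claim_1_holds(sufficient_today, conflicting)
future = claim_1_holds(sufficient_future, conflicting)
still_criminalized = state_is_criminalized(sufficient_future, {"sell crack cocaine"})
print(today, future, still_criminalized)
```

In the imagined future, claim (1) fails because one sufficient action does not conflict with the concern, while the subjective ranking in (2) is untouched.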

Second, because it prevents one side of the argument from misrepresenting the 
opinion of the other side. In this example, it prevents supporters of the current law 
from misrepresenting the opinion of opponents as (being close to) fundamentally 
favouring the falsifying of δ (likely leading to addictions), and it prevents opponents 
of the current law from misrepresenting the opinion of supporters as (being close to) 
fundamentally favouring the impermissibility of ψ (that being high on crack cocaine is 
fundamentally wrong). Such misrepresentation happens often in informal 
argumentation, even sometimes unintentionally! 

For the argument below about S.R.'s case, «SueR request» has the role of ψ. Note 
that its description does not explicitly mention assisted suicide. The conjunction of the 
elements of sc-concerns has the role of δ. As in the previous paragraph, this prevents 
misrepresentation of the opinions of the two sides of the issue: The justices who voted 
against S.R. were not arguing that assisting the suicide of another person is 
fundamentally wrong/impermissible, and the justices who voted in favour of S.R. 
were not arguing that all people have a fundamental right to choose when they will 
die. However, since I am arguing that the court ruled incorrectly, I must connect 
«SueR request» to the issue of assisted suicide. That is the purpose of /Lemma 1; it 
says that the only actions S.R. can take that achieve «SueR request» are ones that 
involve a physician giving her access to lethal drugs (defined by <assisted suicide 
actions for SueR>). 

Let's next focus on the goal sentence /Goal of this argument (the final sentence 
derived from the axioms and lemmas): 

¬<Justifies criminalizing satisfaction of>(sc-concerns,«SueR request») 

sc-concerns is a set of four propositions that were raised by the justices in the 
majority opinion as desirably-satisfied propositions that might be falsified if they rule 
in favour of S.R. The goal sentence says that those concerns are not enough to 
justify criminalizing the satisfaction of «SueR request» (by criminalizing every action 
that can achieve «SueR request»). 

We need a principle of law/morality that connects ¬<Justifies criminalizing 
satisfaction of>(sc-concerns,«SueR request») and the rest of the proof. This is the 
predicate <Reach of law limit> (<- hover cursor to see definition, which will be 
discussed shortly). I've made it a predicate instead of an axiom to avoid having to 
posit that the principle holds generally. Instead it is only assumed for one instance, by 
Assumption 1. That said, I do accept the principle generally with some additional 
qualification. See the end of the criticism section at the bottom of this page for the 
improved version <Reach of law limit 2> with additional qualification. 

You might think that the defining formula for <Reach of law limit> is surprisingly 
complicated. There is a simpler, but stronger principle (see definition of <Reach of law 
limit> for discussion of its weakness), that also suffices to derive /Goal. It says a set 
of concerns Δ does not justify criminalizing an action α if that action can accomplish a 
proposition ψ such that for each of the concerns δ, either α does not conflict with δ, or 
δ is not more important than ψ. I'll now explain why I am reluctant to use this simpler 
principle. The problem is that ψ is not adequately constrained by α. Consider the 
contrapositive: If the concerns Δ justify criminalizing an action α, then for every 
proposition ψ that α accomplishes, there must be a concern δ ∈ Δ such that α conflicts 
with δ and δ is more important than ψ. That consequent is too strong a requirement in 
some cases! Suppose we want to justify the criminalizing of an action that 
accomplishes something very good while unnecessarily accomplishing something bad. 
I'll use an example based on one given by Paul McNamara in a slightly different 
deontic logic context. We want to argue that it should be criminal to perform the 
action α of intentionally and unnecessarily breaking a person Timmy's fingers even if 
it is done while saving Timmy from a fire. The simpler-but-stronger principle that I 
am reluctant to use says that if we believe that (*) := "Timmy is saved and his fingers 
are broken" is more important than "Timmy's fingers are not broken", then we cannot 
consistently justify criminalizing α using just Timmy's desire δ to not have broken 
fingers. In contrast, with <Reach of law limit> we may consistently criminalize α and 
believe (*), under the reasonable assumption that it is possible to save Timmy without 
breaking his fingers. 

Argument 

Variables φ, ψ, ψ1, ψ2, ψ3, ψ4 are reserved for sort Propositions. 

Variables A, A' are reserved for sort Set(Propositions). 

Variable a is reserved for sort Actions. 

Variables X, Y, Z, Y1, Y2, Y3 are reserved for sort Set(Actions). 

Variable p is reserved for sort People. 

Sort op Set - Powerset of the given sort. 

Sort Actions 

Potential concrete actions of individuals. A set each element of which is a 
concrete/unrepeatable, potential action. By "concrete/unrepeatable", I mean that 
each action has a definite location (resp. interval of time) where (resp. when) it 
would hypothetically occur (B). Also, each element of this set can be associated 
with a unique person who performs the action. 

Sort Propositions 

Propositions about the real world; things that will turn out true or false. Each will 
be satisfied or not in every model, but it is nonetheless important that they are 
distinct from 0-ary predicate symbols, as we will have functions with domain 
Propositions. 

Sort People 

The set of residents of Canada who are alive sometime during 1993 (the year of 
S.R.'s supreme court hearing) or later. 

Performed : Actions → 𝔹 - The potential actions that are actually performed. 

Satisfied : Propositions → 𝔹 - The propositions that turn out true. 

S.R. : People - Sue Rodriguez 

«S.R. facts» : Propositions 

The conjunction of the following list of facts (satisfied propositions) about Sue 
Rodriguez: 

1. S.R. has ALS, a usually fatal disease, and multiple doctors have given their 
opinion that her life expectancy is short. 

2. In the late stages of the disease, S.R.'s movement will be greatly restricted. 
If she wishes to live until then (and she does), she will not be able to take 
her own life without assistance. 

3. There is no dispute about whether assisted suicide is truly what S.R. wants, 
as evidenced, for example, by the testimony of her friends and family, lack 
of contradicting testimony from anyone, and her involvement in the Death 
with Dignity movement. 

4. And many more. If some of the assumptions below are expanded into 
proved lemmas, more facts will be added to this list, and at some point it 
may be prudent to break this constant up into a number of constants, or a 
constant of type List(Propositions), so that the individual facts can be 
referred to and criticized more formally. 

«SueR request» : Propositions 

Satisfied iff Sue Rodriguez becomes confident that she will be able to take some 
action (in Actions) such that each of the following hold: 

1. After completion of the action, she has no further severe pain or indignity 
caused by her illness. 

2. The action, during its execution, does not cause pain or unusual physical or 
psychological discomfort. 

3. Her doing the action does not put any friend or loved one at risk of being 
convicted of a criminal offence. 

4. She does not need to leave Canada permanently in order to do the action. 

5. If the action causes permanent loss of consciousness, then it does not occur 
until a time when she believes she is no longer able to enjoy life (which she 
expected would be after losing most of her mobility). 

<assisted suicide actions for SueR> : Set(Actions) 

The set of actions in which Sue Rodriguez, while in Canada, legally obtains, for 
the purpose of ending her life, a lethal dose of barbiturates, morphine, or any other 
drug that is reliably painless, and reliably induces sleeping before it induces loss 
of consciousness and then death. 

a_SR : Actions 

Any element of <assisted suicide actions for SueR>. It may be necessary to make it 
more specific if Assumption 7 is criticized. 

<actions sufficient for> : Propositions → Set(Actions) 

The set of all actions that are feasible and can be expected to result in the given 
proposition being satisfied. 

<More important than> : Propositions × Propositions → 𝔹 

A vague (but sufficiently precisifiable) and highly subjective partial order. 
Example that everyone in the intended audience of this argument should agree on: 
specific propositions corresponding to instances of "the right to not be murdered" 
are typically more important than specific propositions corresponding to instances 
of "the right to free speech". 

Conflicts : Actions × Propositions → 𝔹 

If the relation holds for (a, ψ) then a conflicts with ψ in the sense that if a is 
performed then ψ (causally, directly or indirectly) cannot be satisfied. 

not : Propositions → Propositions 

The proposition that is satisfied iff the given proposition is not satisfied. 

<regrettable legal assisted suicide> : People → 𝔹 

<regrettable legal assisted suicide>(p) is the proposition that is satisfied iff: 

1. p dies by a legally-sanctioned use of assisted suicide. 

2. There is some information about p, which was unknown at the time when 
their application for assisted suicide was approved, that, if it had been 
known, would have caused a significant proportion (say, 5%) of people who 
would have supported p's application to resolutely change their mind. Here 
"resolutely" means that no further information about p would again change 
the minds of those 5% of people. We more simply (but slightly more 
vaguely) say that at least 5% of p's supporters, if given "perfect" 
information about p, would change their minds. 

«avoid judicial overreach» : Propositions 

The ruling of the supreme court justices on Sue Rodriguez's case does not 
constitute "judicial overreach". 

«avoid legal precedent causing slippery slope» : Propositions 

Satisfied if «SueR request» is not satisfied, or if «SueR request» is satisfied and a 
certain kind of "slippery slope" is blocked; in particular, permitting S.R.'s assisted 
suicide request does not "lead to" the Supreme Court or a lower court permitting a 
regrettable (see <regrettable legal assisted suicide>) instance of assisted suicide. 

Defn «no regrettable legal assisted suicide for S.R.» : Propositions - The proposition 
that is satisfied iff <regrettable legal assisted suicide>(S.R.) is not satisfied. 

Satisfied(«no regrettable legal assisted suicide for S.R.») ⇔ ¬<regrettable legal 
assisted suicide>(S.R.) 

«consistency with maj opinion» : Propositions 

The proposition that is satisfied iff the decision made by the Supreme Court is "in 
agreement with the majority opinion" of Canadian citizens on whether S.R. should 
be granted an exception to the criminal code. 

Defn sc-concerns : Set(Propositions) - "sc" for Supreme Court. Some of the concerns 
of opponents of assisted suicide, formulated as propositions that they want to be true. 
Specifically, they are the concerns mentioned in the majority opinion for the actual 
Supreme Court decision. 

sc-concerns = {«avoid judicial overreach», «avoid legal precedent causing 
slippery slope», «no regrettable legal assisted suicide for S.R.», «consistency with 
maj opinion»} 

<Justifies criminalizing> : Set(Propositions) × Set(Actions) → 𝔹 

The desirability of satisfying the given propositions justifies criminalizing the 
given actions. 

Defn <Justifies criminalizing satisfaction of>(A, ψ) : Set(Propositions) × Propositions 
→ 𝔹 - The desirability of satisfying the given propositions A justifies criminalizing 
the satisfaction of ψ. 

∀A, ψ. <Justifies criminalizing satisfaction of>(A, ψ) ⇔ <Justifies criminalizing> 
(A, <actions sufficient for>(ψ)) 

Defn <Reach of law limit> : Set(Propositions) × Propositions → 𝔹 - The defining 
formula of this predicate is a general, but weak, principle of liberalism, which we will 
use one instance of (Assumption 1). It only requires justification for laws that 
criminalize all possible ways of accomplishing a proposition ψ, saying nothing about 
laws that criminalize, without justification, some but not all actions that can 
accomplish ψ. 

It says that the set of (ostensibly desired) propositions A does not justify criminalizing 
the satisfaction of the proposition ψ (by criminalizing all the actions that can 
accomplish ψ) if there is an action that can accomplish ψ such that, for each of the 
desired propositions φ ∈ A, either φ is not more important than ψ, or the action does 
not conflict with φ. 

Contrapositive: Suppose that a set of (ostensibly desired) propositions A justifies 
criminalizing the set of all actions that can accomplish another proposition ψ. 
Intuitively, this means A justifies criminalizing the satisfaction of ψ. Then, it must be 
that for each of those actions a that can achieve ψ, there is a proposition φ ∈ A such 
that a conflicts with φ and φ is more important than ψ. 

∀A. ∀ψ. <Reach of law limit>(A, ψ) ⇔ ((∃a ∈ <actions sufficient for>(ψ). ∀φ ∈ 
A. ¬<More important than>(φ, ψ) ∨ ¬Conflicts(a, φ)) ⇒ ¬<Justifies criminalizing 
satisfaction of>(A, ψ)) 

/Goal: The specific set of concerns sc-concerns does not justify criminalizing the 
satisfaction of «SueR request». 

¬<Justifies criminalizing satisfaction of>(sc-concerns, «SueR request») 

The theorem is a logical consequence of the following axioms; the code that 
generates this HTML file also generates first-order validity problems, which were 
solved by CVC4 and Vampire via System on TPTP . Each axiom can be disputed, 
and some, with more work, can be made into lemmas, proved from more-basic 
assumptions and simplifying assumptions. Each axiom is informally labeled an 
Assumption or Assertion. The Assertions are intended to be uncontroversial. 

Assumption 1: Suppose there is some action a that S.R. can take to achieve «SueR 
request» such that, for any concern ψ ∈ sc-concerns that is not strictly less 
important than «SueR request», the action a does not actually conflict with ψ. 
Then sc-concerns does not justify criminalizing the set of all actions that can 
achieve «SueR request». 

<Reach of law limit>(sc-concerns, «SueR request») 

/Lemma 1: There are no actions that can achieve S.R.'s request other than the 
ones described above in <assisted suicide actions for SueR> (all of which involve her 
use of assisted suicide). 

<assisted suicide actions for SueR> = <actions sufficient for>(«SueR request») 

Argument sketch: I claim that there are three broad categories of actions that 
might plausibly be able to achieve the satisfaction of «SueR request»: (1) 
treatment, (2) suicide, or (3) palliative sedation (aka terminal sedation). In 
Rodriguez's and Taylor's cases, there are no sufficient treatments for ALS and 
there is no hope for the discovery and availability of a new one before their 
death, so (1) is out. To dismiss (3) one must do some reading, e.g. Palliative 
Sedation: It’s Not a Panacea : in short, the ideal of terminal sedation, in which 
a dying patient's life is not shortened, but all their suffering is medicated away, 
is far from achieved in practice. If it was ideal, we would include in <actions 
sufficient for>(«SueR request») actions by which a patient is guaranteed 
access to terminal sedation (there is currently no general way of getting such a 
guarantee in Canada; one has to just get lucky to end up with a doctor who is 
willing to do it). 

A_t, A_s, A_ps : Set(Actions) 

Actions that could plausibly achieve, respectively, a treatment/cure, 
suicide (including assisted suicide), or palliative sedation, for S.R. 

Assertion 1: Every action that can achieve S.R.'s request is one involving a 
treatment/cure of her condition, some form of suicide, or some form of palliative 
sedation. 

<actions sufficient for>(«SueR request») ⊆ A_t ∪ A_s ∪ A_ps 

Argument sketch: I claim that there are three broad categories of actions 
that might plausibly be able to achieve the satisfaction of «SueR request»: 
(1) treatments/cures, (2) suicide, or (3) palliative sedation (aka terminal 
sedation). I don't anticipate that this would be disputed. 

Assertion 2: Treatment/cure-seeking actions cannot achieve S.R.'s request. 

<actions sufficient for>(«SueR request») ∩ A_t = ∅ 

Argument sketch: While ALS could be effectively treated or cured some 
day, «SueR request» would require the discovery and minimal testing of 
such a treatment within a year or two, and there is negligible hope within 
the medical community for that. 

Assumption 2: Palliative sedation cannot achieve S.R.'s request. 

<actions sufficient for>(«SueR request») ∩ A_ps = ∅ 

Argument sketch: See Palliative Sedation: It’s Not a Panacea : in short, the 
ideal of terminal sedation, in which a dying patient's life is not shortened, 
but all their suffering is medicated away, is far from achieved in practice. 
If it was ideal, we would need to include, in <actions sufficient for>(«SueR 
request»), actions through which a patient becomes guaranteed access to 
terminal sedation. However, there is currently no general way of getting 
such a guarantee in Canada; one has to just get lucky to end up with a 
doctor who is willing to do it. 

Assumption 3: Among the possible ways that S.R. could end her own life, 
only those in <assisted suicide actions for SueR> satisfy the criteria of «SueR 
request». 

<actions sufficient for>(«SueR request») ∩ A_s = <assisted suicide actions for 
SueR> 

Argument sketch: Proving this involves a morose consideration of all the 
known methods of suicide, observing that each of them, besides the use of 
legally-prescribed sedatives, violates at least one of the conditions of 
«SueR request». 

Assertion 3: The uncontroversial assertion that the actions described in 
<assisted suicide actions for SueR> would suffice to meet S.R.'s desired 
condition «SueR request». 

<assisted suicide actions for SueR> ⊆ <actions sufficient for>(«SueR 
request») 

Assertion 4: The specific action a_SR is in <assisted suicide actions for SueR> 
(informally "by definition"). 

a_SR ∈ <assisted suicide actions for SueR> 
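
The set-theoretic step behind /Lemma 1 can be checked on a small finite model. The following is only a sketch under stated assumptions: the action names, the constants `A_T`, `A_S`, `A_PS`, `ASSISTED`, and the helpers `premises_hold` and `all_subsets` are hypothetical, and a toy model is no substitute for the first-order proof found by the provers. It enumerates every candidate value for the set of actions sufficient for «SueR request» and confirms that the five premises leave only one possibility.

```python
from itertools import chain, combinations

# Toy universe of actions; every name here is a hypothetical placeholder.
A_T = {"experimental_treatment"}                   # treatment/cure actions
A_S = {"legal_prescription", "unassisted_method"}  # suicide actions
A_PS = {"palliative_sedation"}                     # palliative sedation actions
ASSISTED = {"legal_prescription"}                  # <assisted suicide actions for SueR>


def premises_hold(sufficient):
    """True iff `sufficient` satisfies the five premises used for Lemma 1."""
    return (
        sufficient <= A_T | A_S | A_PS        # Assertion 1
        and not (sufficient & A_T)            # Assertion 2
        and not (sufficient & A_PS)           # Assumption 2
        and (sufficient & A_S) == ASSISTED    # Assumption 3
        and ASSISTED <= sufficient            # Assertion 3
    )


def all_subsets(universe):
    """Every subset of `universe`, as Python sets."""
    items = sorted(universe)
    picks = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [set(pick) for pick in picks]


# Candidate interpretations of <actions sufficient for>(«SueR request»).
candidates = [s for s in all_subsets(A_T | A_S | A_PS) if premises_hold(s)]
lemma_1_holds = all(s == ASSISTED for s in candidates)
print(candidates, lemma_1_holds)
```

In this toy model Assertion 3 is already implied by Assumption 3; it is kept as a separate premise only to mirror the list of axioms above.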

Assumption 4: The judges' necessary involvement in making it legal for S.R. to 
obtain lethal prescription drugs need not constitute judicial overreach. 

¬Conflicts(a_SR, «avoid judicial overreach») 

Argument sketch: Claim that a suspended annulment, with a period of at least 
one year, plus a special waiver for one person, is never judicial overreach. In a 
suspended annulment, a law is declared unconstitutional, but it is allowed to 
remain in effect for a period of time, to give the legislative branch the 
opportunity to replace it with a new, constitutional law. Reading: Myth of 
Judicial Overreach 

Assumption 5: The judges' necessary involvement in making it legal for S.R. to 
obtain lethal prescription drugs need not create a legal precedent that leads to a 
"slippery slope". 

¬Conflicts(a_SR, «avoid legal precedent causing slippery slope») 

Argument sketch: Though the justices may not be able to artificially specify 
that their ruling in favour of S.R. should not be used as precedent, they can 
certainly restrict the extent of the precedent, by specifying only that the law is 
unconstitutional for any citizen satisfying «S.R. facts». A citizen petitioning a 
lower court for access to assisted suicide, who does not satisfy all of «S.R. 
facts», would be neither helped nor hindered by the ruling in favour of S.R. 
Given that consideration, we can use the same argument that we use for 
Assumption 7 to justify this assumption (since in that argument «S.R. facts» 
are the only facts we use about S.R.). 

Assumption 6: S.R.'s claim for «SueR request» trumps her opponents' claim for 
Supreme Court decisions to be consistent with the majority opinion (among 
Canadian citizens). 

¬<More important than>(«consistency with maj opinion», «SueR request») 

Argument sketch: The goal of protection against tyranny of the majority is 
precisely what makes the constitution special, compared to other laws. My 
opinion is that «consistency with maj opinion» should be given very small (if 
not zero) weight when assessing whether a part of the law should be repealed 
on constitutional grounds. To replace this assumption with a high-level 
proof... 

Assumption 7: The particular assisted suicide action a_SR that we chose does not 
conflict with the desire to avoid S.R. being the victim of a <regrettable legal assisted 
suicide>. 

¬Conflicts(a_SR, «no regrettable legal assisted suicide for S.R.») 

Argument sketch: This is easily argued by reference to «S.R. facts». 
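
How the axioms combine to give /Goal can be sketched in a few lines. This is an illustration only, not the generated first-order problem that was sent to the provers: the string labels, the constants `SUFFICIENT`, `NO_CONFLICT`, `NOT_MORE_IMPORTANT`, and the function `antecedent` are hypothetical names, and the relations are read pessimistically (a conflict or a "more important than" is presumed unless an assumption rules it out).

```python
SC_CONCERNS = [
    "avoid judicial overreach",
    "avoid legal precedent causing slippery slope",
    "no regrettable legal assisted suicide for S.R.",
    "consistency with maj opinion",
]
REQUEST = "SueR request"
A_SR = "a_SR"

# Lemma 1 with Assertion 4: a_SR is among the actions sufficient for the request.
SUFFICIENT = {REQUEST: {A_SR}}

# Assumptions 4, 5 and 7: (action, concern) pairs known not to conflict.
NO_CONFLICT = {
    (A_SR, "avoid judicial overreach"),
    (A_SR, "avoid legal precedent causing slippery slope"),
    (A_SR, "no regrettable legal assisted suicide for S.R."),
}

# Assumption 6: (concern, proposition) pairs where the concern is known
# not to be more important than the proposition.
NOT_MORE_IMPORTANT = {("consistency with maj opinion", REQUEST)}


def antecedent(concerns, psi, no_conflict, not_more_important):
    """Antecedent of <Reach of law limit>: some sufficient action answers
    every concern, either by not conflicting or by outweighing it."""
    return any(
        all(
            (phi, psi) in not_more_important or (a, phi) in no_conflict
            for phi in concerns
        )
        for a in SUFFICIENT[psi]
    )


# Assumption 1 turns a true antecedent into the Goal.
goal_derivable = antecedent(SC_CONCERNS, REQUEST, NO_CONFLICT, NOT_MORE_IMPORTANT)

# Dropping Assumption 6 leaves one concern unanswered, so the derivation fails.
without_assumption_6 = antecedent(SC_CONCERNS, REQUEST, NO_CONFLICT, set())
print(goal_derivable, without_assumption_6)
```

The second call shows why each of Assumptions 4 to 7 is a separate point of attack for a critic: removing any one of them leaves a concern in sc-concerns that the antecedent cannot discharge.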




Criticizing the argument 

The justices that ruled against Sue Rodriguez (hereafter called "the majority", as in the 
decision itself) and wrote the majority opinion in the Rodriguez ruling implicitly 
criticized Assumption 4, Assumption 5, and Assumption 6, but not Assumption 7. 
Their arguments did not significantly touch on the details considered in /Lemma 1, 
which is not surprising as they did not attempt to reason with non-trivial precision 
about the costs and benefits for Sue Rodriguez. 

Take Assumption 6 for example. The authors of the majority opinion write: "the issue 
before the Court was whether a criminal prohibition on assisting suicide in situations 
where a person is terminally ill and mentally competent but unable to commit suicide 
by him or herself, is contrary to the principles of fundamental justice. What are 
principles of fundamental justice? Mr. Justice Sopinka noted that determining these 
principles can be an onerous task. Such principles, he pointed out, are those for which 
there is some consensus among reasonable people as to their importance to our 
societal concept of justice." 

From that statement, we can see that the majority rather flatly disagree with the 
opinion I express in the prose below the statement of Assumption 6. So how to 
proceed after that? I could dispute their claim about the consensus of the Canadian 
public (probably hard, but that situation is slowly improving in favour of assisted 
suicide), or I could look for other cases that demonstrate the justices informally 
contradicting their allegiance to the principle expressed in the above quote (probably 
easy), or I could take this as a fundamental source of subjective disagreement. For the 
last option, I would move on to their criticisms of the other assumptions, and try to 
refute them (by demonstrating informal inconsistency), so that we are left with only 
one source of fundamental disagreement. 

Another possible criticism is that sc-concerns is too small a set. A critic may wish to 
add an entirely new Propositions constant to sc-concerns, or perhaps the conjunction 
of two or more of the current elements of sc-concerns. An addition of the latter type 
could conceivably have a significant effect if combined with an addition of the first 
type, since Assumption 1 considers the elements of the first argument to <Reach of 
law limit> separately (an addition of the first type would be necessary to really force 
me to respond in a challenging way, since currently only one of the 4 primitive 
concerns is compared to «SueR request» with <More important than>). In any case, a 
language modification and extension that makes <Reach of law limit> respect 
conjunctions of elements of sc-concerns is as follows: 

and : Set(Propositions) → Propositions - Conjunction of the given set of propositions. 
⊆ : Set(Propositions) × Set(Propositions) → 𝔹 - Subset 
\ : Set(Propositions) × Set(Propositions) → Set(Propositions) - Set difference 

Defn <Reach of law limit 2> : Set(Propositions) × Propositions → 𝔹 - <Reach of law 
limit 2>(A, ψ) asserts the following statement, which is an implication. If A can be 
partitioned into two sets A' and A \ A' such that 

• the conjunction of the concerns A' is not more important than ψ, and 

• there is an action a that can accomplish ψ such that none of the concerns in 
A \ A' actually conflict with a 

then A does not justify criminalizing all the actions that can accomplish ψ. 

∀A, ψ. <Reach of law limit 2>(A, ψ) ⇔ ((∃A' ⊆ A. ¬<More important than>(and(A'), ψ) 
∧ (∃a ∈ <actions sufficient for>(ψ). ∀φ ∈ A \ A'. ¬Conflicts(a, φ))) ⇒ 
¬<Justifies criminalizing satisfaction of>(A, ψ)) 
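
The antecedent of <Reach of law limit 2> quantifies over subsets of the concerns, which is easy to evaluate by brute force on a finite model. The sketch below is hypothetical throughout: the function `reach_of_law_limit_2_antecedent`, the short concern labels, and the toy relations `conj_more_important`, `conflicts` and `sufficient` are illustrative assumptions, not part of the argument itself.

```python
from itertools import combinations


def reach_of_law_limit_2_antecedent(concerns, psi, sufficient,
                                    conj_more_important, conflicts):
    """True iff some subset A' of the concerns has a conjunction that is not
    more important than psi, and some sufficient action conflicts with none
    of the remaining concerns (A minus A')."""
    concerns = frozenset(concerns)
    for size in range(len(concerns) + 1):
        for chosen in combinations(sorted(concerns), size):
            a_prime = frozenset(chosen)
            if conj_more_important(a_prime, psi):
                continue
            rest = concerns - a_prime
            if any(all(not conflicts(a, phi) for phi in rest)
                   for a in sufficient(psi)):
                return True
    return False


# Hypothetical toy model: a conjunction is outweighed by the request only if
# it contains nothing besides "majority"; a_SR conflicts only with "new",
# a concern that a critic might add.
def conj_more_important(a_prime, psi):
    return bool(a_prime - {"majority"})


def conflicts(a, phi):
    return phi == "new"


def sufficient(psi):
    return {"a_SR"}


original = reach_of_law_limit_2_antecedent(
    {"overreach", "slope", "regret", "majority"}, "request",
    sufficient, conj_more_important, conflicts)
extended = reach_of_law_limit_2_antecedent(
    {"overreach", "slope", "regret", "majority", "new"}, "request",
    sufficient, conj_more_important, conflicts)
print(original, extended)
```

With the critic's added concern the antecedent fails in this toy model, which is the kind of addition that would force a substantive response.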


Chapter 5 


Example: Berkeley gender bias lawsuit 

The following table summarizes UC Berkeley’s Fall 1973 admissions data for its six 
largest departments. Across all six departments, the acceptance rates for men and women 
are about 44.5% and 30.4% respectively. The large observed bias prompted a lawsuit 
against the university, alleging gender discrimination. 1 In [BHO75] it was argued that 
the observed bias was actually due to a tendency of women to disproportionately apply 
to departments that have high rejection rates for both sexes. 


              Male                  Female                Total 
Department    Applied  Accepted     Applied  Accepted     Applied  Accepted 
D_1           825      512 (62%)    108      89 (82%)     933      601 (64%) 
D_2           560      353 (63%)    25       17 (68%)     585      370 (63%) 
D_3           325      120 (37%)    593      202 (34%)    918      322 (35%) 
D_4           417      138 (33%)    375      131 (35%)    792      269 (34%) 
D_5           191      53 (28%)     393      94 (24%)     584      147 (25%) 
D_6           373      22 (6%)      341      24 (7%)      714      46 (6%) 
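
The aggregate figures quoted above can be recomputed directly from the table. The short script below is a sketch for checking the arithmetic; the names `MEN`, `WOMEN`, `totals`, `rate` and `women_ahead` are introduced here for illustration and do not appear in the argument.

```python
from fractions import Fraction

# (applied, accepted) per department D_1..D_6, copied from the table.
MEN = [(825, 512), (560, 353), (325, 120), (417, 138), (191, 53), (373, 22)]
WOMEN = [(108, 89), (25, 17), (593, 202), (375, 131), (393, 94), (341, 24)]


def totals(rows):
    """Total (applied, accepted) over all departments."""
    return sum(a for a, _ in rows), sum(c for _, c in rows)


def rate(applied, accepted):
    """Exact acceptance rate as a fraction."""
    return Fraction(accepted, applied)


men_rate = rate(*totals(MEN))
women_rate = rate(*totals(WOMEN))

# Departments in which women were accepted at a higher rate than men.
women_ahead = [
    d + 1
    for d, (m, w) in enumerate(zip(MEN, WOMEN))
    if rate(*w) > rate(*m)
]

print(f"men:   {float(men_rate):.1%}")
print(f"women: {float(women_rate):.1%}")
print("women ahead in departments:", women_ahead)
```

The aggregate rates favour men even though women have the higher acceptance rate in four of the six departments, which is the pattern the two arguments below set out to explain.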


The first argument I give is similar to the final analysis given in [BHO75], 2 though 
it makes weaker assumptions (Assumption 8 in particular: their corresponding, implicit 
assumption is obtained by replacing the parameters .037 and 9 with 0s). The argument 
resolves the apparent paradox by assuming a sufficiently-precise definition of “gender 

1 The data given is apparently the only data that has been made public. The lawsuit was based on the 
data from all 101 graduate departments, which showed a pattern similar to what the data from the 6 
largest shows. 

2 The paper is written to convey the subtlety of the statistical phenomenon involved (an instance of 
“Simpson’s Paradox”), and so considers several poor choices of statistical analyses before arriving at the 
final one. 

discrimination” and reasoning from there. More specifically, it first fixes a definition of 
“gender discrimination” and then defines (in natural language) a hypothetical admissions 
protocol that prevents gender discrimination by design. Considering then a hypothetical 
round-of-admissions scenario that has the same set of applications as in the actual round 
of admissions, if we assume that the ungendered departmental acceptance rates are not 
much different in the hypothetical scenario, then it can be shown that the overall bias is 
actually worse for women in the hypothetical scenario. Since the hypothetical scenario 
has no gender discrimination by design, and is otherwise as similar as possible to the real 
scenario, we conclude that the observed bias cannot be blamed on gender discrimination. 

The second argument tells us why it is that our vagueness about “gender discrimination” 
resulted in an apparent paradox; namely, we were implicitly admitting definitions of “gender 
discrimination” that allow for the question of the presence/absence of discrimination to 
depend on whether or not the sexes apply to different departments at different rates. If 
we forbid such definitions, then to prove that the gendered departmental acceptance rates 
do not constitute gender discrimination, it should suffice to show that there is an overall 
bias in favour of women in any hypothetical admissions round in which the gendered 
departmental acceptance rates are close to what they actually were, and where men and 
women apply to each department at close to the same rate. 

I’ll use g to refer to the language interpretation guide for the language $\mathcal{L}$ of this 
argument. 

$\mathcal{L} \setminus \mathcal{L}_{\mathrm{rigid}}$ consists of: 

• The constant $\mathrm{Acc}_{\mathrm{hyp}}$. 

• The propositional variables (i.e. 0-ary predicate symbols) (bias only evidence), 
(lawsuit should be dismissed), (gender uncor with ability in each dept). 

$\mathcal{L}_{\mathrm{rigid}}$ consists of: 

• A number of mathematical symbols that have their standard meaning: constants 
0, 1, 512, 825, ..., function symbols $|\cdot|$, $\cap$, $\cup$, $/$, predicate symbols $<$, $=$. 

• The constants $\mathrm{App}$, $\mathrm{Acc}$, $\mathrm{App}^{m}$, $\mathrm{App}^{f}$, $\mathrm{App}^{1}, \ldots, \mathrm{App}^{6}$. Since the elements of these 
sets are not in the universe, their semantics are determined by axioms that assert 
their sizes and the sizes of sets formed by intersecting and unioning them with each 
other. 

• The sorts are $A$ for application sets and $\mathbb{Q}_\bot$ for the rational numbers with an element 
for “undefined”. See below for g’s entries for them. 

The types of the function/predicate symbols other than the 0-ary predicate symbols 
(and besides =, which is untyped) are as follows. With respect to the definition of 
interpreted formal proof from Section 2.2, they are all assumptions as opposed to simplifying 
assumptions. 


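$\mathrm{App}, \mathrm{Acc}, \mathrm{Acc}_{\mathrm{hyp}}, \mathrm{App}^{m}, \mathrm{App}^{f}, \mathrm{App}^{1}, \ldots, \mathrm{App}^{6} : A$ 

$|\cdot| : A \to \mathbb{Q}_\bot$ 

$0, 1, 512, 825, \ldots : \mathbb{Q}_\bot$ 

$\cap, \cup : A \times A \to A$ 

$+, -, * : \mathbb{Q}_\bot \times \mathbb{Q}_\bot \to \mathbb{Q}_\bot$ 

$/ : \mathbb{Q}_\bot \times \mathbb{Q}_\bot \to \mathbb{Q}_\bot$ 

$< : \mathbb{Q}_\bot \times \mathbb{Q}_\bot \to \mathbb{B}$ 4 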


5.1 First argument 

The goal sentence is the following implication involving propositional variables whose 
informal meanings, given by the language interpretation guide g, will be given next. 

(gender uncor with ability in each dept) ∧ (bias only evidence) ⇒ (lawsuit should be dismissed) 

g((bias only evidence)) consists of the above table, and then the assertion: “The bias 
shown in the data is the only evidence put forward by the group who accused Berkeley of 
gender discrimination.” 

g((gender uncor with ability in each dept)) we take to be just “Assumption 1” from 
[BHO75], which I quote here: 

Assumption 1 is that in any given discipline male and female applicants do 
not differ in respect of their intelligence, skill, qualifications, promise, or other 
attribute deemed legitimately pertinent to their acceptance as students. It is 
precisely this assumption that makes the study of "sex bias" meaningful, for 
if we did not hold it any differences in acceptance of applicants by sex could 
be attributed to differences in their qualifications, promise as scholars, and so 
on. Theoretically one could test the assumption, for example, by examining 
presumably unbiased estimators of academic qualification such as Graduate 
Record Examination scores, undergraduate grade point averages, and so on. 

There are, however, enormous practical difficulties in this. We therefore 
predicate our discussion on the validity of assumption 1. [BHO75] 

g((lawsuit should be dismissed)) = The judge hearing the suit against Berkeley should 
dismiss the suit on grounds of lack of evidence. 

g($\mathbb{Q}_\bot$) = The rational numbers plus one extra object for error/undefined. 

g($A$) = The powerset of App. Note that the individual applications are not in the 
universe of discourse (though each singleton set is), since they are not required for the 
proof. 

g also says that 

• 0, 1, 512, etc. are the expected numerals. 

• $|\cdot|$ is the function that gives the size of each set in $A$. 

• $\cap$, $\cup$ are the expected binary functions on $A$. 

• $+$, $-$, $*$ are the expected binary functions on the naturals extended so that they 
equal $\bot$ when either or both of the arguments are $\bot$. 

• $/$ is division on the rationals extended so that it equals $\bot$ iff one or both of the 
arguments are $\bot$ or the second argument is 0. 

• $<$ is the usual ordering on the rationals extended by making $\bot$ be neither greater 
than nor less than any number or itself. 
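
The extended arithmetic described in the last three bullets can be made concrete with a short sketch. It is an illustration under stated assumptions: `None` stands in for the extra "undefined" element, and the names `BOTTOM`, `add`, `sub`, `mul`, `div` and `less` are hypothetical, introduced here only to mirror the guide's entries.

```python
from fractions import Fraction

BOTTOM = None  # stands in for the extra "undefined" element of the sort


def _strict(op):
    """Lift a binary operation so it returns BOTTOM if either argument is BOTTOM."""
    def lifted(x, y):
        if x is BOTTOM or y is BOTTOM:
            return BOTTOM
        return op(Fraction(x), Fraction(y))
    return lifted


add = _strict(lambda x, y: x + y)
sub = _strict(lambda x, y: x - y)
mul = _strict(lambda x, y: x * y)


def div(x, y):
    """Division that is BOTTOM iff an argument is BOTTOM or the divisor is 0."""
    if x is BOTTOM or y is BOTTOM or y == 0:
        return BOTTOM
    return Fraction(x) / Fraction(y)


def less(x, y):
    """Strict order in which BOTTOM is incomparable with everything, itself included."""
    if x is BOTTOM or y is BOTTOM:
        return False
    return Fraction(x) < Fraction(y)


print(div(512, 825), div(1, 0), less(BOTTOM, 1), less(1, BOTTOM))
```

Making the order false whenever an argument is undefined means that an inequality involving a division by zero can never be satisfied by accident.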

Recall that the next 11 symbols are all 0-ary constant symbols. 

g(App) = App is the set of applications. Its size is 4526 (sum of the entries in the two 
“Applied” columns of the table). 

g(Acc) = Acc is the set of (actual) accepted applications. Its size is 1755 (sum of the 
entries in the two “Accepted” columns of the table). 

g($\mathrm{Acc}_{\mathrm{hyp}}$) = We need a sufficiently-precise, context-specific definition of “gender 
discrimination,” and to get it we imagine a hypothetical scenario. An alternative admissions process 
is used, which starts with exactly the same set of applications App, and then involves 
an elaborate 5, manual process of masking the gender on each of them (including any 
publications and other supporting materials). The application reviewers, while reading 
the applications and making their decisions, are locked in a room together without access 
to outside information, except that interviews are done over computer using an instant 
messaging client (which, of course, is monitored to make sure the gender of the applicant 
remains ambiguous). Then, $\mathrm{Acc}_{\mathrm{hyp}}$ is the set of accepted applications in the hypothetical 
scenario. 

5 It need not be efficient/economical, since we are only introducing the scenario as a reasoning device. 

g($\mathrm{App}^{m}$) = $\mathrm{App}^{m}$ is a subset of App of size 2691 (sum of the first “Applied” column in the 
table), specifically the applications where the applicant is male. 

g($\mathrm{App}^{f}$) = $\mathrm{App}^{f}$ is a subset of App of size 1835 (sum of the second “Applied” column in 
the table), specifically the applications where the applicant is female. 

For d = 1, ..., 6: 

g($\mathrm{App}^{d}$) = $\mathrm{App}^{d}$ is the set of applications for admission into department d. 


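Definition 1. For $g \in \{m, f\}$ and $d \in \{1, \ldots, 6\}$: 

\[
\begin{aligned}
\mathrm{App} &:= \mathrm{App}^{m} \uplus \mathrm{App}^{f} \\
\mathrm{App}^{d,g} &:= \mathrm{App}^{d} \cap \mathrm{App}^{g} \\
\mathrm{Acc}^{d,g} &:= \mathrm{App}^{d,g} \cap \mathrm{Acc} \\
\mathrm{Acc}_{\mathrm{hyp}}^{d,g} &:= \mathrm{App}^{d,g} \cap \mathrm{Acc}_{\mathrm{hyp}}
\end{aligned}
\]

Definition 2. For $x, y, z \in \mathbb{Q}$, we write $z \in [x \pm y]$ for $x - y \le z \le x + y$. 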


Assumption 7. In the hypothetical scenario, the number of applicants of gender g 
accepted to department d is as close as possible to what we’d expect assuming that gender 
is uncorrelated with ability within the set of applicants to department d. For $d \in \{1, \ldots, 6\}$ 
and $g \in \{m, f\}$: 

\[
(\text{gender uncor with ability in each dept}) \Rightarrow
\left|\mathrm{Acc}_{\mathrm{hyp}}^{d,g}\right| \in
\left[ \left|\mathrm{Acc}_{\mathrm{hyp}}^{d}\right| \cdot \frac{\left|\mathrm{App}^{d,g}\right|}{\left|\mathrm{App}^{d}\right|} \pm 1/2 \right]
\]


Assumption 8. Assuming that gender is uncorrelated with ability within the set of 
applicants to department d, the number of applicants accepted to department d in the 
hypothetical scenario is close to the number accepted in the real scenario. That is, the 
overall, non-gendered departmental acceptance rates do not change much when we switch 
to gender-blind reviews. We require that a model satisfies at least one of the following 
two quantifications of that idea. For $d \in \{1, \ldots, 6\}$: 

\[
(\text{gender uncor with ability in each dept}) \Rightarrow
\left( \bigwedge_{1 \le d \le 6} |\mathrm{Acc}^{d}| \cdot (1 - .037) \le |\mathrm{Acc}_{\mathrm{hyp}}^{d}| \le |\mathrm{Acc}^{d}| \cdot (1 + .037) \right)
\vee
\left( \bigwedge_{1 \le d \le 6} |\mathrm{Acc}_{\mathrm{hyp}}^{d}| \in \left[ |\mathrm{Acc}^{d}| \pm 9 \right] \right)
\]

The constants .037 and 9 are roughly the most extreme values that make the proof go 
through. To illustrate the first form, the bounds for the departments with the fewest and 
greatest number of accepted applicants are: 

\[
45 \le |\mathrm{Acc}_{\mathrm{hyp}}^{6}| \le 47
\quad\text{and}\quad
579 \le |\mathrm{Acc}_{\mathrm{hyp}}^{1}| \le 623
\]
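
The two illustrative bounds can be recomputed from the table. The sketch below assumes that the department totals are the entries of the table's "Total, Accepted" column; the names `ACCEPTED`, `EPSILON`, `DELTA`, `relative_bounds` and `absolute_bounds` are introduced here for illustration only.

```python
import math
from fractions import Fraction

# |Acc^d| for d = 1..6, from the "Total, Accepted" column of the table.
ACCEPTED = {1: 601, 2: 370, 3: 322, 4: 269, 5: 147, 6: 46}
EPSILON = Fraction(37, 1000)  # the relative tolerance .037
DELTA = 9                     # the absolute tolerance 9


def relative_bounds(n):
    """Integer range allowed by n*(1 - .037) <= k <= n*(1 + .037)."""
    return math.ceil(n * (1 - EPSILON)), math.floor(n * (1 + EPSILON))


def absolute_bounds(n):
    """Integer range allowed by the second form, k within 9 of n."""
    return n - DELTA, n + DELTA


for d, n in ACCEPTED.items():
    print(d, relative_bounds(n), absolute_bounds(n))
```

Exact fractions are used for the tolerance so that the rounding to integer bounds is not affected by floating-point error.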

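Definition 3. For $g \in \{m, f\}$: 

\[
\mathrm{accRate}^{g} := |\mathrm{Acc}^{g}| / |\mathrm{App}^{g}|
\quad\text{and}\quad
\mathrm{accRate}_{\mathrm{hyp}}^{g} := |\mathrm{Acc}_{\mathrm{hyp}}^{g}| / |\mathrm{App}^{g}|
\]

Assumption 9. If (bias only evidence) and 

\[
\frac{\mathrm{accRate}_{\mathrm{hyp}}^{m}}{\mathrm{accRate}_{\mathrm{hyp}}^{f}} > \frac{\mathrm{accRate}^{m}}{\mathrm{accRate}^{f}}
\]

then (lawsuit should be dismissed). 

Simplifying Assumption 2. (bias only evidence) 

Claim 1. 

\[
(\text{gender uncor with ability in each dept}) \Rightarrow
\frac{\mathrm{accRate}_{\mathrm{hyp}}^{m}}{\mathrm{accRate}_{\mathrm{hyp}}^{f}} > \frac{\mathrm{accRate}^{m}}{\mathrm{accRate}^{f}}
\]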


Proof. It is not hard to formulate this as a linear integer programming problem, where
the variables are the sizes of the sets $\mathrm{Acc}^{d,g}_{hyp}$. Coming up with inequalities that express the
previous axioms and the data axioms from Section 5.3 is easy. Reduce the Claim itself
to a linear inequality, and then negate it. One can then prove, using any decent integer
programming solver, that the resulting system of constraints is unsatisfiable. □
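As a rough sanity check on the direction of Claim 1 (not a substitute for the integer programming proof), one can compute a point estimate that ignores the tolerances in Assumptions 7 and 8: each department keeps its real number of acceptances and splits them between men and women in proportion to applicants. The sketch below does this with exact fractions. The names `DATA` and `acceptance_rates` are introduced here, and the 138 accepted men in department 4 is an assumption taken from the standard Berkeley 1973 data.

```python
from fractions import Fraction

# (applicants, accepted) per department and gender, from Section 5.3.
# The 138 accepted men in department 4 is an assumption (standard
# Berkeley 1973 admissions data).
DATA = {
    1: {"m": (825, 512), "f": (108, 89)},
    2: {"m": (560, 353), "f": (25, 17)},
    3: {"m": (325, 120), "f": (593, 202)},
    4: {"m": (417, 138), "f": (375, 131)},
    5: {"m": (191, 53), "f": (393, 94)},
    6: {"m": (373, 22), "f": (341, 24)},
}

def acceptance_rates(accepted_by_gender):
    """Overall acceptance rate per gender: |Acc^g| / |App^g|."""
    rates = {}
    for g in ("m", "f"):
        applicants = sum(DATA[d][g][0] for d in DATA)
        rates[g] = Fraction(accepted_by_gender[g]) / applicants
    return rates

# Real scenario: total accepted per gender.
real = {g: sum(DATA[d][g][1] for d in DATA) for g in ("m", "f")}

# Gender-blind point estimate: each department keeps its total number of
# acceptances and splits them in proportion to applicants of each gender.
hyp = {"m": Fraction(0), "f": Fraction(0)}
for row in DATA.values():
    dept_app = row["m"][0] + row["f"][0]
    dept_acc = row["m"][1] + row["f"][1]
    for g in ("m", "f"):
        hyp[g] += Fraction(dept_acc * row[g][0], dept_app)

real_rates = acceptance_rates(real)
hyp_rates = acceptance_rates(hyp)
real_ratio = real_rates["m"] / real_rates["f"]   # observed male/female ratio
hyp_ratio = hyp_rates["m"] / hyp_rates["f"]      # gender-blind point estimate
print(float(real_ratio), float(hyp_ratio))
```

In this point estimate the male-to-female ratio of acceptance rates comes out larger in the gender-blind scenario than in the real one; the proof itself must additionally cover every integer assignment permitted by the tolerances.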


Claim 2. The goal sentence easily follows from the previous three propositions.


(gender uncor with ability in each dept) $\wedge$ (bias only evidence) $\Rightarrow$ (lawsuit should be dismissed)


5.2 Second argument 

This second argument better captures the intuition of the usual informal resolution of 
the apparent paradox; the observed bias is completely explained by the fact that women 
favored highly-competitive departments (meaning, with higher rejection rates) more so 
than men. We show that there is an overall bias in favour of women in any hypothetical 
admissions round in which the gendered departmental acceptance rates are close to what
they actually were, and where men and women apply to each department at close to the 
same rate. 

In this argument, the set of applications in the hypothetical scenario can be different
from those in the real scenario, so we introduce the new symbols $\mathrm{App}^{d}_{hyp} : A$ for $1 \le d \le 6$.

The hypothetical admissions round is similar to the true admissions round (Axioms 
10 and 12) except that men and women apply to each department at close to the same 
rate (Assumption 11) - meaning the fraction of male applications that go to department 
d is close to the fraction of female applications that go to department d. We need to 
update the language interpretation guide entries $g(\mathrm{App}^{d}_{hyp})$ and $g(\mathrm{Acc}_{hyp})$ to reflect these
alternate assumptions. 

This proof uses Definitions 1 and 2 from the previous proof. 

Assumption 10. In the hypothetical round of admissions, the total number of applications
to department d is the same as in the actual round of admissions. Likewise for the
total number of applications from men and women. 6
For $d \in \{1,\ldots,6\}$ and $g \in \{m, f\}$:


\[ |\mathrm{App}^{d}_{hyp}| = |\mathrm{App}^{d}|, \qquad |\mathrm{App}^{g}_{hyp}| = |\mathrm{App}^{g}| \]


Assumption 11. In the hypothetical scenario, gendered departmental application rates
are close to gender-independent. For $d \in \{1,\ldots,6\}$ and $g \in \{m, f\}$:


I App 


d,g | 

hyp I 


G 


l A PPhyp 


|ApPhyp| 

|ApP hyp | 


Assumption 12. In the hypothetical scenario, gendered departmental acceptance rates
are close to the same as in the real scenario.

For $d \in \{1,\ldots,6\}$ and $g \in \{m, f\}$:

\[ |\mathrm{Acc}^{d,g}_{hyp}| \in \left[\, \frac{|\mathrm{Acc}^{d,g}|}{|\mathrm{App}^{d,g}|} \cdot |\mathrm{App}^{d,g}_{hyp}| \pm 6 \,\right] \]


Claim 3. $\mathrm{accRate}^{f}_{hyp} > \mathrm{accRate}^{m}_{hyp}$

Proof. As in the previous proof, it is easy to reduce this to a linear integer programming
problem. Coming up with constraints that express the previous axioms and the data
axioms from the next section is easy. Then, add the constraint

\[ \Big( \sum_{d} |\mathrm{Acc}^{d,f}_{hyp}| \Big) / |\mathrm{App}^{f}| \;\le\; \Big( \sum_{d} |\mathrm{Acc}^{d,m}_{hyp}| \Big) / |\mathrm{App}^{m}| \]

which expresses the negation of the Claim (recall that $|\mathrm{App}^{m}|$ and $|\mathrm{App}^{f}|$ are constants).
Finally, prove that the resulting system of constraints is unsatisfiable. □

6 This axiom could be weakened in principle, by replacing the equations with bounds, but doing so in
the obvious way introduces nonlinear constraints, and then I would need to use a different constraint
solver.
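To see why Claim 3 is plausible before running a solver, one can evaluate the idealized case in which the tolerances of Assumptions 11 and 12 are zero: every department receives the same share of male and of female applications, and each gendered departmental acceptance rate is exactly its real value. The sketch below computes the resulting overall rates. The names `DATA` and `hypothetical_rate` are introduced here, the 138 accepted men in department 4 is an assumption taken from the standard Berkeley 1973 data, and fractional applicant counts are allowed, so this is an illustration rather than the integer programming proof.

```python
from fractions import Fraction

# (applicants, accepted) per department and gender, from Section 5.3.
# The 138 accepted men in department 4 is an assumption (standard
# Berkeley 1973 admissions data).
DATA = {
    1: {"m": (825, 512), "f": (108, 89)},
    2: {"m": (560, 353), "f": (25, 17)},
    3: {"m": (325, 120), "f": (593, 202)},
    4: {"m": (417, 138), "f": (375, 131)},
    5: {"m": (191, 53), "f": (393, 94)},
    6: {"m": (373, 22), "f": (341, 24)},
}

total_app = sum(row[g][0] for row in DATA.values() for g in ("m", "f"))

def hypothetical_rate(g):
    """Overall acceptance rate for gender g when each department gets the
    same share of applications from both genders and keeps its real
    acceptance rate for gender g."""
    rate = Fraction(0)
    for row in DATA.values():
        dept_share = Fraction(row["m"][0] + row["f"][0], total_app)
        dept_rate = Fraction(row[g][1], row[g][0])
        rate += dept_share * dept_rate
    return rate

rate_f = hypothetical_rate("f")
rate_m = hypothetical_rate("m")
print(float(rate_f), float(rate_m))
```

In this idealized case the overall acceptance rate for women is higher than for men, which matches the direction of Claim 3.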

Assumption 13. If (bias only evidence) and $\mathrm{accRate}^{f}_{hyp} > \mathrm{accRate}^{m}_{hyp}$
then (lawsuit should be dismissed)

Simplifying Assumption 2 from the previous proof, which just asserts (bias only evidence),
is also used here. From it, Assumption 13, and Claim 3, the goal sentence
(lawsuit should be dismissed) follows immediately.

5.3 Data Axioms 

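Assumption 14.

$|\mathrm{App}| = 4526$, $\bigwedge_{1 \le d \le 6} \mathrm{App}^{d} \subseteq \mathrm{App}$, $\mathrm{Acc} \subseteq \mathrm{App}$, $\mathrm{Acc}_{hyp} \subseteq \mathrm{App}$

$|\mathrm{App}^{1,m}| = 825$, $|\mathrm{Acc}^{1,m}| = 512$, $|\mathrm{App}^{1,f}| = 108$, $|\mathrm{Acc}^{1,f}| = 89$
$|\mathrm{App}^{2,m}| = 560$, $|\mathrm{Acc}^{2,m}| = 353$, $|\mathrm{App}^{2,f}| = 25$, $|\mathrm{Acc}^{2,f}| = 17$
$|\mathrm{App}^{3,m}| = 325$, $|\mathrm{Acc}^{3,m}| = 120$, $|\mathrm{App}^{3,f}| = 593$, $|\mathrm{Acc}^{3,f}| = 202$
$|\mathrm{App}^{4,m}| = 417$, $|\mathrm{Acc}^{4,m}| = 138$, $|\mathrm{App}^{4,f}| = 375$, $|\mathrm{Acc}^{4,f}| = 131$
$|\mathrm{App}^{5,m}| = 191$, $|\mathrm{Acc}^{5,m}| = 53$, $|\mathrm{App}^{5,f}| = 393$, $|\mathrm{Acc}^{5,f}| = 94$
$|\mathrm{App}^{6,m}| = 373$, $|\mathrm{Acc}^{6,m}| = 22$, $|\mathrm{App}^{6,f}| = 341$, $|\mathrm{Acc}^{6,f}| = 24$


That App is the disjoint union of $\mathrm{App}^{1}, \ldots, \mathrm{App}^{6}$ follows from the previous sentences
(under the standard interpretation of numbers and sets).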


Chapter 6 

Example: Leighton Hay’s wrongful 
conviction 


Leighton Hay is one of two men convicted of murdering a man in an Ontario nightclub 
in 2002. The other man, Gary Eunich, is certainly guilty, but evidence against Hay is 
weak, much weaker, in my opinion and in the opinion of the Association in Defence of
the Wrongly Convicted (AIDWYC) 1 , than should have been necessary to convict. A good,
short summary about the case can be found here: 

http://www.theglobeandmail.com/news/national/defence-prosecution-split-on 
-need-for-forensic-hair-testing/articlel367543/ 

The prosecution’s case relies strongly on the testimony of one witness, Leisa Maillard, 
who picked (a 2 year old picture of) Hay out of a photo lineup of 12 black men of similar 
age, and said she was 80% sure that he was the shooter. There were a number of other 
witnesses, none of whom identified Hay as one of the killers. Ms. Maillard’s testimony is 
weak in a number of ways (e.g. she failed to identify him in a lineup a week after the 
shooting, and at two trials when she picked out Gary Eunich instead), but here we will be 
concerned with only one of them: she described the unknown killer as having 2-inch “picky 
dreads,” whereas Hay had short-trimmed hair when he was arrested the morning after the 
murder. Thus, the police introduced the theory that Hay cut his hair during the night, 
between the murder and his arrest the following morning. In support of the theory, they 
offered as evidence a balled-up newspaper containing hair clippings that was found at the 
top of the garbage in the bathroom used by Hay. Their theory, in more detail, is that the 
known killer, Gary Eunich, cut Hay’s hair and beard during the night between the murder 
and the arrests, using the newspaper to catch the discarded hair, then emptied most 
of the discarded hair into the toilet; and crucially, a hundred-or-so short hair clippings 

1 Thanks to Joanne McLean and Deryck Ramcharitar for making the case files available to me. 
Name in proof      Max width (micrometers)      Count
$\mathrm{bin}_1$   0 to 112.5                   10
$\mathrm{bin}_2$   112.5 to 137.5               20
$\mathrm{bin}_3$   137.5 to 162.5               40
$\mathrm{bin}_4$   162.5 to 187.5               19

Table 6.1: Measurements of 89 hairs found in a balled-up newspaper at the top of Hay's
bathroom garbage. Forensic experts on both sides agreed that the hairs in $\mathrm{bin}_3$ and $\mathrm{bin}_4$
are very likely beard hairs, and that the hairs in $\mathrm{bin}_1$ and $\mathrm{bin}_2$ could be either beard or
scalp hairs.

remained stuck to the newspaper (Due perhaps to being lighter than the dreads? It was 
not explained why.). It is the origin of those hair clippings that we are concerned with in 
this argument; Hay has always said that the clippings were from a recent beard-only trim. 
If that is so, then the newspaper clippings are not at all inculpatory, and knowing this 
could very well have changed the jury’s verdict, since the clippings -as hard as this is to 
believe- were the main corroborating evidence in support of Ms. Maillard’s eye witness 
testimony. 

Both sides, defense and prosecution, agree that the newspaper clippings belong to 
Hay, and that either they originated from his beard and scalp (prosecution’s theory), or 
just his beard (defense’s theory). We will try to prove, from reasonable assumptions, that 
it is more likely that the hair clippings were the product of a beard-only trim than it is 
that they were the product of a beard and scalp trim. 

On 8 Nov 2013 the Supreme Court of Canada granted Hay a new trial in a unanimous 
decision, based on the new expert analysis of the hair clippings that we use in this argument. 
We do not yet know whether the Crown will choose to prosecute Hay again, or if they do, 
whether they will attempt to again use the hair clippings as evidence against him. On
28 Nov 2014, the Crown dropped its murder charges against Hay, declining to prosecute 
him again, and he was freed. As usual in these cases, there was no pronouncement of 
innocence, and Hay and his lawyers will have to fight for monetary compensation. 


6.1 High-level argument 

In 2002, the prosecution introduced the theory that Hay was the second gunman and must 
have had his dreads cut off and hair trimmed short during the night following the murder. 
It is clear that they did this to maintain the credibility of their main witness. In 2012, 
after the new forensic tests ordered by AIDWYC proved that at least most of the hairs 
found in Hay’s bathroom were (very likely) beard hairs, the prosecution changed their 
Max width (micrometers)      Count
12.5 to 37.5                 3
37.5 to 62.5                 28
62.5 to 87.5                 41
87.5 to 112.5                17
112.5 to 137.5               1

Table 6.2: Measurements of Hay's scalp hairs obtained at the request of AIDWYC in 2010.
Note that the first 4 bins are contained in $\mathrm{bin}_1$ from Table 1. Samples of Hay's beard
hairs were not taken and measured in 2010 because the forensic hair experts advised that
beard hairs get thicker as a man ages.

theory to accommodate, now hypothesizing that the hairs came from the combination 
of beard and scalp trims with the same electric razor, using the newspaper to catch the 
clipped hairs for both trims. Intuitively, that progression of theories is highly suspicious. 

On the other hand, perhaps the hairs did come from the combination of a beard 
and scalp trim, and the prosecution was simply careless in formulating their original 
theory. We cannot dismiss the newspaper hairs evidence just because we do not respect 
the reasoning and rhetoric employed by the prosecution. The argument below takes the 
prosecution’s latest theory seriously. At a high level, the argument has the following 
structure: 

1. There are many distinct theories of how the hypothesized beard and scalp trims 
could have happened. In the argument below, we introduce a family of such theories 
indexed by the parameters $\alpha_{min}$ and $\alpha_{max}$.

2. Most of the theories in that family are bad for the prosecution; they result in a 
model that predicts the data worse than the defense’s beard-trim-only theory. 

3. The prosecution cannot justify choosing from among just the theories that are good 
for them, or giving such theories greater weight. 

We will deduce how the parameters $\alpha_{min}$ and $\alpha_{max}$ must be set in order for the prosecution's
theory to have predictive power as good as the defense’s theory, and we will find that the 
parameters would need to be set to values that have no (supplied) reasonable justification 
(without referring to the measurements, which would be using the data to fit the model 
that the data is supposed to predict). If the assumptions from which we derive the 
parametric theory are reasonable (e.g. the fixed prior over distributions for Hay’s beard 
hair widths, and the fixed distribution for Hay’s scalp hair widths), then we can conclude 
that the newspaper hair evidence is not inculpatory. 

Though the argument to follow is unquestionably an example of Bayesian analysis, I 
prefer to use the language of frequencies and repeatable events rather than degrees of 
belief. One could just as well use the language of degrees of belief, with only superficial
changes to the axioms. 

We posit constraints on a randomized simulation model of the crime and evidence, 
which is applicable not just to Hay’s case, but also to a number of very-similar hypothetical 
cases (in some of which the suspect is guilty) taken from an implicitly-constrained 
distribution D. The probabilities are just parameters of the model, and in principle we 
judge models according to how often they make the correct prediction when a case is 
chosen at random from D. In the argument below, we don’t use D directly, but rather 
use a distribution over a small number of random variables that are meaningful in D , 
namely the joint distribution for the random variables: 

G, Clipped, Mix, BParams, H, Widths 

Some of the most significant assumptions for the argument are as follows: 

1. The prior chosen for the suspect’s beard hair-width distribution is fair and reasonable. 2 
This is Simplifying Assumption 4. It is probably the most disputable of the assumptions. 

I give some criticisms of it in Section 6.3. 

2. The distribution for the suspect’s scalp hair widths, based on the samples taken in 2010, 
is fair and reasonable (Simplifying Assumption 6). This assumption may be disputable 
in that it does not assume that Hay’s scalp hairs thinned by an average amount for a 
man of his age and race in the 8 years between the crime and when his scalp hair sample 
was taken. Of course, it may be that his hairs have not thinned at all. Unfortunately it 
appears that we cannot know this, as samples were not taken in 2002. 

3. The simulation model, on runs where the suspect is guilty (and thus the newspaper hair
evidence comes from a combined beard and scalp trim), chooses uniformly at random
(Simplifying Assumption 3) from a sufficiently large range the ratio

\[ \frac{P(\text{random clipped hair came from beard, given only that it ended up in the newspaper})}{P(\text{random clipped hair came from the scalp, given only that it ended up in the newspaper})} \tag{6.1} \]

Specifically that range is $[\alpha_{min}/(1-\alpha_{min}),\ \alpha_{max}/(1-\alpha_{max})]$ (but note: no axiom of the argument requires
that symbolic form to be meaningful to the reader). The axioms enforce no constraints
about $\alpha_{min}$ and $\alpha_{max}$ except for $0 < \alpha_{min} < \alpha_{max} < 1$, but the hypotheses of Claims 5
and 6 assert significant constraints; it turns out that in order for the likelihood ratio
$P(\mathrm{Widths} = b \mid G) / P(\mathrm{Widths} = b \mid \overline{G})$ to be $\ge 1$, the prosecution needs to make an extreme assumption about
$\alpha_{min}$ and $\alpha_{max}$. Intuitively, assuming the suspect is guilty, both prosecution and defense
are still very ignorant (before seeing the newspaper hair measurements) of how exactly
the suspect trimmed his beard and scalp, e.g. in what order, how exactly he used the
newspaper, and how exactly he emptied most of the clippings into the toilet, all of which
would influence the above ratio (6.1). The hypotheses of Claims 5 and 6 formalize that
intuition in different ways, which are close to equivalent, but nonetheless I think Claim
6 is significantly easier to understand and accept.

2 The reason we use a prior for the suspect's beard hair width distribution is that Leighton Hay's
beard hair widths were never sampled; that decision was on the advice of one of the hair forensics experts,
who said that a man's beard hairs tend to get thicker as he ages.

4. The suspect in the simulation model does not have an unusually low ratio of scalp hairs 
to beard hairs. This is Assumption 22. We can improve the current argument, if we 
wish, by having the simulation model choose that ratio from some prior distribution, 
and doing so actually results in a version of Claim 6 that is better for the defense. I 
don’t do this simply because the extra complexity would reduce the pedagogical value of 
this example. 

6.2 Argument 

Because this argument is written in LaTeX, I present it more informally than is required
by the definition of interpreted formal proof. In particular, I do not explicitly name the
types of most symbols, and I don't explain how exactly random variables and the
P(proposition | proposition) syntax are formalized.

I will often use the following basic facts. In the completely-formal proof they would be
axioms in $T_{assum}$ that use only symbols in $\mathcal{L}_{rigid}$, and thus should be accepted by any
member in the intended audience of the proof.

• For $t_1, t_2, t_3$ boolean-valued terms:

\[ P(t_1, t_2 \mid t_3) = P(t_1 \mid t_2, t_3)\, P(t_2 \mid t_3) \]

• For X a continuous random variable with conditional density function $d_X$ whose
domain S is a polygonal subset of $\mathbb{R}^n$ for some n:

\[ P(t_1 \mid t_2) = \int_{x \in S} P(t_1 \mid t_2, X = x)\, d_X(x \mid t_2) \]

$\mathrm{bin}_1, \mathrm{bin}_2, \mathrm{bin}_3, \mathrm{bin}_4$ are constants denoting the four micrometer-intervals from Table
1. Formally, they belong to their own sort, which has exactly 4 elements in every model.
We do not actually have micrometer intervals in the ontology of the proof, so we could
just as well use $\{1, 2, 3, 4\}$, but I think that would be confusing later on. Bins is the sort
$\{\mathrm{bin}_1, \mathrm{bin}_2, \mathrm{bin}_3, \mathrm{bin}_4\}$.

Throughout this writeup, $b = b_1, \ldots, b_{89}$ is a fixed ordering of the newspaper hair
measurements shown in Table 1. Specifically, each $b_i$ is one of the constants $\mathrm{bin}_1$, $\mathrm{bin}_2$, $\mathrm{bin}_3$,
or $\mathrm{bin}_4$; $\mathrm{bin}_1$ appears 10 times, $\mathrm{bin}_2$ 20 times, $\mathrm{bin}_3$ 40 times, and $\mathrm{bin}_4$ 19 times.

$p$ abbreviates $\langle p_1, p_2, p_3 \rangle$.

$p_4$ abbreviates $1 - p_1 - p_2 - p_3$ (except in Claim 8, as noted there also).

G is the boolean simulation random variable that determines if the suspect in the current
run is guilty. I write just $G$ to abbreviate G = true and $\overline{G}$ to abbreviate G = false.

Clipped is a simulation random variable whose value is determined by G. When G is 
false, Clipped is the set of beard hair fragments that fall from the suspect’s face when 
he does a full beard trim with an electric trimmer 3 several days before the murder took 
place. When G is true, Clipped is the set of beard and scalp hair fragments that fall from 
the suspect’s head when he does a full beard trim and a full scalp trim (the latter after 
cutting off his two-inch dreads) with the same electric trimmer. This includes any such
fragments that were flushed down the sink or toilet, but not including -in the case that 
the suspect is guilty- hair fragments that were part of his 2-inch “picky dreads.” 

H is a simulation random variable whose distribution is the uniform distribution over 
Clipped, i.e. it is a random hair clipping. 

BParams is the simulation random variable that gives the parameters of the suspect’s 
beard hair width distribution. 

Mix is the simulation random variable that gives the mixture parameter that determines
the prosecution's newspaper hair width distribution given the beard and scalp hair
width distributions.

NOTATION: BParams and Mix will usually be hidden in order to de-clutter equations
and to fit within the page width. Wherever you see $p$ or $\langle p_1, p_2, p_3 \rangle$ where a boolean-valued
term is expected, that is an abbreviation for BParams = $p$ or BParams = $\langle p_1, p_2, p_3 \rangle$,
respectively. Similarly, I write just $\alpha$ as an abbreviation for Mix = $\alpha$.

3 The police collected an electric trimmer that was found, unhidden, in Hay's bedside drawer, which
Hay has always said he used for trimming his beard.

B is the set from which our prior for the suspect's beard hair width distribution is defined.
It is the set of triples $\langle p_1, p_2, p_3 \rangle \in [0, 1]^3$ such that $p_1 < p_2, p_3, p_4$ and $\langle p_1, p_2, p_3, p_4 \rangle$ is
unimodal when interpreted as a discrete distribution where $p_i$ is the probability that the
width of a hair randomly chosen from the suspect's beard (in 2002) falls in $\mathrm{bin}_i$.

$P(t_1 \mid t_2)$ is the notation we use for the Bayesian/simulation distribution over the random
variables G, Clipped, Mix, BParams, H, Widths, where $t_1$ and $t_2$ are terms taking on boolean
values; it is the probability over runs of the simulation that $t_1$ evaluates to true given that
$t_2$ evaluates to true.

Widths is the simulation random variable that gives the approximate widths (in terms of
the 4 intervals $\mathrm{bin}_i$) of the 89 hair clippings that end up in the balled-up newspaper.

NOTATION: When the variables $p$ and $\alpha$ appear unbound in an axiom, I mean for them
to be implicitly quantified in the outermost position like so: $\forall p \in B$ and $\forall \alpha \in [\alpha_{min}, \alpha_{max}]$.

When X is a continuous random variable with a density function, dx denotes that function. 


Definition 4. We are aiming to show that from reasonable assumptions, the following
likelihood ratio is less than 1, meaning that the defense's theory explains the
newspaper hairs evidence at least as well as the prosecution's theory. The notation
likelihood-ratio($\alpha_{min}$, $\alpha_{max}$) is used just to highlight the dependence on the parameters
$\alpha_{min}$, $\alpha_{max}$.

\[ \text{likelihood-ratio}(\alpha_{min}, \alpha_{max}) := \frac{P(\mathrm{Widths} = b \mid G)}{P(\mathrm{Widths} = b \mid \overline{G})} \]

Assumption 15. The values of BParams and Mix are chosen independently of each other
and G (whether or not the suspect is guilty). Hence the defense and prosecution have the
same prior for the suspect's beard hair width distribution.

For $t \in \{\text{true}, \text{false}\}$:

\[ d_{\langle \mathrm{BParams}, \mathrm{Mix} \rangle}(p, \alpha \mid G = t) = d_{\mathrm{BParams}}(p) \cdot d_{\mathrm{Mix}}(\alpha) \]

$\alpha_{min}$ and $\alpha_{max}$ are constants in (0,1) such that $\alpha_{min} < \alpha_{max}$.
Simplifying Assumption 3. The prior distribution for the mixture parameter Mix is
the uniform distribution over $[\alpha_{min}, \alpha_{max}]$.

\[ d_{\mathrm{Mix}}(\alpha) = \begin{cases} 1/(\alpha_{max} - \alpha_{min}) & \text{if } \alpha \in [\alpha_{min}, \alpha_{max}] \\ 0 & \text{otherwise} \end{cases} \]

Simplifying Assumption 4. The prior distribution for the parameters of the suspect's
beard hair width distribution is the uniform distribution over the set $B \subseteq [0, 1]^3$ defined
above.

\[ d_{\mathrm{BParams}}(p) = \begin{cases} 1/\|B\| & \text{if } p \in B \\ 0 & \text{otherwise} \end{cases} \]


News(h) = true iff the hair clipping h ends up in the balled-up newspaper.

Beard(h) = true (respectively Scalp(h) = true) iff hair clipping h came from the suspect's
beard (respectively scalp).

Assumption 16. Both prosecution and defense agreed that all the hairs in the newspaper
came from the suspect's beard or scalp, and not both. 4

\[ \mathrm{Scalp}(h) = \neg\,\mathrm{Beard}(h) \]

width is the function from Clipped to $\{\mathrm{bin}_1, \mathrm{bin}_2, \mathrm{bin}_3, \mathrm{bin}_4\}$ such that width(h) is the
interval in which the maximum-width of hair clipping h falls.

Simplifying Assumption 5. In the simulation model, the hairs that ended up in the
newspaper are chosen independently at random with replacement from some hair-width
distributions.

\[ P(\mathrm{Widths} = b \mid G, p, \alpha) = \prod_{i=1}^{89} P(\mathrm{width}(H) = b_i \mid \mathrm{News}(H), G, p, \alpha) \]

\[ P(\mathrm{Widths} = b \mid \overline{G}, p) = \prod_{i=1}^{89} P(\mathrm{width}(H) = b_i \mid \mathrm{News}(H), \overline{G}, p) \]

Claim 4. We can write the width distribution of newspaper hairs in terms of the width
distributions of beard and scalp hairs, together with the probability that a random
newspaper hair is a beard hair.

\[ \begin{aligned} P(\mathrm{width}(H) = b_i \mid \mathrm{News}(H), G, p, \alpha) = {} & P(\mathrm{width}(H) = b_i \mid \mathrm{Beard}(H), \mathrm{News}(H), G, p, \alpha)\, P(\mathrm{Beard}(H) \mid \mathrm{News}(H), G, p, \alpha) \\ {} + {} & P(\mathrm{width}(H) = b_i \mid \mathrm{Scalp}(H), \mathrm{News}(H), G, p, \alpha)\, P(\mathrm{Scalp}(H) \mid \mathrm{News}(H), G, p, \alpha) \end{aligned} \]

Proof. Follows from Assumption 16. □

4 "Not both" actually ignores the issue of sideburn hairs, whose widths can be intermediate between
scalp and beard hair widths. Doing this is favourable for the prosecution.

Assumption 17. In the defense's model (not guilty, $\overline{G}$), all the newspaper hair came
from a beard trim, and so the mixture parameter is irrelevant.

\[ P(\mathrm{width}(H) = b_i \mid \mathrm{News}(H), \overline{G}, p, \alpha) = P(\mathrm{width}(H) = b_i \mid \mathrm{Beard}(H), \mathrm{News}(H), \overline{G}, p) \]

Assumption 18. Given that a clipped hair came from the suspect's beard, the hair's
width is independent of whether the suspect is guilty in this run of the simulation. Thus
the defense and prosecution models use the same distribution of hair widths for the
suspect's beard.

\[ \begin{aligned} P(\mathrm{width}(H) = b_i \mid \mathrm{Beard}(H), \mathrm{News}(H), G, \alpha, p) & = P(\mathrm{width}(H) = b_i \mid \mathrm{Beard}(H), \mathrm{News}(H), \overline{G}, \alpha, p) \\ & = P(\mathrm{width}(H) = b_i \mid \mathrm{Beard}(H), \mathrm{News}(H), p) \end{aligned} \]

Assumption 19. We finally give the precise meaning of the simulation's mixture parameter
random variable Mix. It is the probability, when the suspect is guilty, that a
randomly chosen hair clipping came from the suspect's beard given that it ended up in
the newspaper.

\[ \alpha = P(\mathrm{Beard}(H) \mid \mathrm{News}(H), G, p, \mathrm{Mix} = \alpha) \]

\[ 1 - \alpha = P(\mathrm{Scalp}(H) \mid \mathrm{News}(H), G, p, \mathrm{Mix} = \alpha) \]

Assumption 20. The precise meaning of the simulation random variable BParams. Recall
that $p_4$ abbreviates $1 - p_1 - p_2 - p_3$. For $j \in \{1, 2, 3, 4\}$:

\[ p_j = P(\mathrm{width}(H) = \mathrm{bin}_j \mid \mathrm{Beard}(H), \mathrm{BParams} = \langle p_1, p_2, p_3 \rangle, \mathrm{News}(H)) \]

Simplifying Assumption 6. We use a completely-fixed distribution for the suspect's
scalp hair, namely the one that maximizes the probability of obtaining the hair sample
measurements from Table 2 when 90 hairs are chosen independently and uniformly at
random from the suspect's scalp.

\[ P(\mathrm{width}(H) = \mathrm{bin}_i \mid \mathrm{Scalp}(H), G, \alpha, p) = \begin{cases} 89/90 & \text{if } i = 1 \\ 1/90 & \text{if } i = 2 \\ 0 & \text{if } i = 3, 4 \end{cases} \]
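The fixed scalp distribution is the maximum-likelihood estimate for independent draws, which is just the empirical frequency of each bin. As a sketch of where 89/90 and 1/90 come from, the snippet below re-bins the Table 6.2 sample into the four intervals of Table 6.1. The names `SCALP_SAMPLE`, `BIN_EDGES`, and `scalp_bin_distribution` are introduced here for illustration only.

```python
from fractions import Fraction

# Table 6.2: (upper edge of width interval in micrometers, count).
SCALP_SAMPLE = [(37.5, 3), (62.5, 28), (87.5, 41), (112.5, 17), (137.5, 1)]
# Upper edges of bin_1..bin_4 from Table 6.1.
BIN_EDGES = [112.5, 137.5, 162.5, 187.5]

def scalp_bin_distribution():
    """Empirical frequency of the scalp sample in each of bin_1..bin_4."""
    counts = [0] * len(BIN_EDGES)
    for upper, n in SCALP_SAMPLE:
        # Index of the first Table 6.1 bin whose upper edge covers this interval.
        idx = next(i for i, edge in enumerate(BIN_EDGES) if upper <= edge)
        counts[idx] += n
    total = sum(counts)
    return [Fraction(c, total) for c in counts]

scalp_dist = scalp_bin_distribution()
print(scalp_dist)
```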


The next axiom and claim give the main result, and the later Claim 6 is 
(almost) a corollary of Claim 5. 

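Assumption 21. If $\frac{P(\mathrm{Widths} = b \mid G)}{P(\mathrm{Widths} = b \mid \overline{G})} \le 1$ (i.e. likelihood-ratio $\le 1$), then

(the newspaper hair evidence is neutral or exculpatory). 5

Claim 5. If $\alpha_{min} \le .849$ then $\frac{P(\mathrm{Widths} = b \mid G)}{P(\mathrm{Widths} = b \mid \overline{G})} < 1$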

The proof of Claim 5 is outlined formally below, after Claim 6. 


With the introduction of a new parameter and a mild assumption about its values 
(Assumption 22, the ratio on the left side being the new parameter), we will obtain a 
corollary of Claim 5 that is easier to interpret. 

We do not know what the ratio of beard to scalp hairs on Hay’s head was on the date 
of the murder, and it is not hard to see that a higher value of P(Beard(H) | G ,p,a) is 
favourable for the prosecution. 6 We do, however, know that the unknown shooter’s beard 
was described as “scraggly” and “patchy” by eye witnesses, and we have no reason to think 
that LH had a smaller than average number of scalp hairs. Thus it is a conservative 
approximation (from the perspective of the prosecution) to assume that Hay had a great 
quantity of beard hairs for a man (40,000), and an average quantity of scalp hairs for a 
man with black hair (110,000). 7 Thus we assume: 

Assumption 22.

\[ \frac{P(\mathrm{Beard}(H) \mid G, p, \alpha)}{P(\mathrm{Scalp}(H) \mid G, p, \alpha)} \le \frac{4}{11} \]

Claim 6. The hypothesis of Assumption 21 also follows if we assume Assumption 22 and
that the uniform prior over Mix gives positive density to a model where a random clipped
beard hair is at most 15 times as likely to end up in the newspaper as a random clipped scalp
hair.

If there exists $\alpha \in [\alpha_{min}, \alpha_{max}]$ and $p \in B$ such that

\[ \frac{P(\mathrm{News}(H) \mid \mathrm{Beard}(H), G, p, \alpha)}{P(\mathrm{News}(H) \mid \mathrm{Scalp}(H), G, p, \alpha)} \le 15 \]

then

\[ \frac{P(\mathrm{Widths} = b \mid G)}{P(\mathrm{Widths} = b \mid \overline{G})} \le 1 \]

5 The text in brackets is a constant predicate symbol.

6 Raising the value makes both models worse, but it hurts the prosecution's model less since the
prosecution's model can accommodate by lowering $\alpha_{min}$ and $\alpha_{max}$.

7 Trustworthy sources for these numbers are hard to find. 40,000 is just the largest figure I found
amongst untrustworthy sources, and 110,000 is a figure that appears in a number of untrustworthy
sources. If this troubles you, consider the ratio a parameter whose upper bound we can argue about later.

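Proof. Let $\alpha, p$ be as in the hypothesis.

From basic rules about conditional probabilities:

\[ \frac{\alpha}{1 - \alpha} = \frac{P(\mathrm{Beard}(H) \mid \mathrm{News}(H), G, p, \alpha)}{P(\mathrm{Scalp}(H) \mid \mathrm{News}(H), G, p, \alpha)} = \frac{P(\mathrm{News}(H) \mid \mathrm{Beard}(H), G, p, \alpha)}{P(\mathrm{News}(H) \mid \mathrm{Scalp}(H), G, p, \alpha)} \cdot \frac{P(\mathrm{Beard}(H) \mid G, p, \alpha)}{P(\mathrm{Scalp}(H) \mid G, p, \alpha)} \tag{6.2} \]

Using the inequality from the hypothesis and Assumption 22, solve for $\alpha$ in (6.2). This
gives $\alpha \le 0.84507$. Since $\alpha_{min} \le \alpha$ we have $\alpha_{min} \le .84507$, so we can use Claim 5 to
conclude that the likelihood ratio is less than 1. □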
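The arithmetic behind the bound 0.84507 can be checked directly. The sketch below assumes the two bounds implied by the surrounding text: a factor of 15 for the newspaper-retention ratio (Claim 6) and 40,000/110,000 for the beard-to-scalp hair ratio (the figures quoted before Assumption 22). The variable names are introduced here for illustration.

```python
from fractions import Fraction

# Assumed bounds, read off the surrounding text:
news_ratio_bound = Fraction(15)                 # P(News|Beard) / P(News|Scalp)
hair_ratio_bound = Fraction(40_000, 110_000)    # P(Beard) / P(Scalp)

# By (6.2), alpha / (1 - alpha) is at most the product of the two bounds.
odds_bound = news_ratio_bound * hair_ratio_bound
# Solving x = alpha / (1 - alpha) for alpha gives alpha = x / (1 + x).
alpha_bound = odds_bound / (1 + odds_bound)
print(alpha_bound, float(alpha_bound))
```

The exact value is 60/71, which is below the threshold .849 required by Claim 5.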

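Simplifying Assumption 7 (hypothesis of Claim 6). There exists $\alpha \in [\alpha_{min}, \alpha_{max}]$ and
$p \in B$ such that

\[ \frac{P(\mathrm{News}(H) \mid \mathrm{Beard}(H), G, p, \alpha)}{P(\mathrm{News}(H) \mid \mathrm{Scalp}(H), G, p, \alpha)} \le 15 \]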


Goal Sentence 1 . (the newspaper hair evidence is neutral or exculpatory) 

Proof. From Simplifying Assumption 7, Claim 6, and Assumption 21. □ 

Proof of Claim 5 

Note: there is nothing very interesting about this proof; it is basically just a guide for
computing the likelihood-ratio as a function of $\alpha_{min}$, $\alpha_{max}$.

To compute the integrals, I will break up the polygonal region B into several pieces which
are easier to handle with normal Riemann integration over real intervals.

Let $B_1$ be the subset of B where $p_2 > p_3 \ge p_4$
$B_2$ the subset of B where $p_3 > p_2 > p_4$
$B_3$ the subset of B where $p_3 > p_4 \ge p_2$
$B_4$ the subset of B where $p_4 > p_3 \ge p_2$

Claim 7. B is the disjoint union of $B_1, B_2, B_3, B_4$.
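Claim 8. In the scope of this claim, $p_4$ is a normal variable, not an abbreviation for
$1 - p_1 - p_2 - p_3$.

\[ \int_{p = \langle p_1, p_2, p_3 \rangle \in B_1} f(p_1, p_2, p_3, 1 - p_1 - p_2 - p_3)\, dp = \int_{p_1 = 0}^{1/4} \int_{p_4 = p_1}^{\frac{1 - p_1}{3}} \int_{p_3 = p_4}^{\frac{1 - p_1 - p_4}{2}} f(p_1, 1 - p_1 - p_3 - p_4, p_3, p_4)\, dp_3\, dp_4\, dp_1 \]

\[ \int_{p = \langle p_1, p_2, p_3 \rangle \in B_2} f(p_1, p_2, p_3, 1 - p_1 - p_2 - p_3)\, dp = \int_{p_1 = 0}^{1/4} \int_{p_4 = p_1}^{\frac{1 - p_1}{3}} \int_{p_2 = p_4}^{\frac{1 - p_1 - p_4}{2}} f(p_1, p_2, 1 - p_1 - p_2 - p_4, p_4)\, dp_2\, dp_4\, dp_1 \]

\[ \int_{p = \langle p_1, p_2, p_3 \rangle \in B_3} f(p_1, p_2, p_3, 1 - p_1 - p_2 - p_3)\, dp = \int_{p_1 = 0}^{1/4} \int_{p_2 = p_1}^{\frac{1 - p_1}{3}} \int_{p_4 = p_2}^{\frac{1 - p_1 - p_2}{2}} f(p_1, p_2, 1 - p_1 - p_2 - p_4, p_4)\, dp_4\, dp_2\, dp_1 \]

\[ \int_{p = \langle p_1, p_2, p_3 \rangle \in B_4} f(p_1, p_2, p_3, 1 - p_1 - p_2 - p_3)\, dp = \int_{p_1 = 0}^{1/4} \int_{p_2 = p_1}^{\frac{1 - p_1}{3}} \int_{p_3 = p_2}^{\frac{1 - p_1 - p_2}{2}} f(p_1, p_2, p_3, 1 - p_1 - p_2 - p_3)\, dp_3\, dp_2\, dp_1 \]

Claim 9. $\|B\| = 1/36$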

Proof. The measure of $B_j$ can be computed by standard means by substituting 1 in for
$f(\ldots)$ in the right side of the j-th equation of Claim 8. We find that $\|B_1\| = \|B_2\| =
\|B_3\| = \|B_4\| = 1/144$. Hence $\|B\| = 1/36$ follows from Claim 7. □
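As an illustration of the computation for $j = 1$, here is the integral worked out by hand. It assumes the integration limits for $B_1$ are $0 \le p_1 \le 1/4$, $p_1 \le p_4 \le (1 - p_1)/3$, and $p_4 \le p_3 \le (1 - p_1 - p_4)/2$, which is how the first equation of Claim 8 is read here; the other three pieces are handled the same way after renaming variables.

```latex
\begin{aligned}
\|B_1\|
  &= \int_{p_1=0}^{1/4} \int_{p_4=p_1}^{\frac{1-p_1}{3}}
     \int_{p_3=p_4}^{\frac{1-p_1-p_4}{2}} 1 \; dp_3 \, dp_4 \, dp_1 \\
  &= \int_{p_1=0}^{1/4} \int_{p_4=p_1}^{\frac{1-p_1}{3}}
     \frac{1 - p_1 - 3 p_4}{2} \; dp_4 \, dp_1 \\
  &= \int_{p_1=0}^{1/4} \frac{(1 - 4 p_1)^2}{12} \; dp_1 \\
  &= \frac{1}{12} \cdot \frac{1}{12} = \frac{1}{144}.
\end{aligned}
```

Four pieces of measure 1/144 each give $\|B\| = 4/144 = 1/36$.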


Claim 10. Simplified forms amenable to efficient computation:

\[ P(\mathrm{Widths} = b \mid \overline{G}, \langle p_1, p_2, p_3 \rangle) = p_1^{10}\, p_2^{20}\, p_3^{40}\, p_4^{19} \]

\[ P(\mathrm{Widths} = b \mid G, \langle p_1, p_2, p_3 \rangle, \alpha) = \left(p_1 \alpha + \tfrac{89}{90}(1 - \alpha)\right)^{10} \left(p_2 \alpha + \tfrac{1}{90}(1 - \alpha)\right)^{20} (p_3 \alpha)^{40}\, (p_4 \alpha)^{19} \]

Proof. The first equation follows easily from Simplifying Assumption 5 and Assumption
20. The second follows easily from Simplifying Assumption 5, Axioms 20 and 19, and
Claim 4. □
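The two closed forms are products of 89 small numbers, so in practice one would probably evaluate them in log space. The following is a minimal sketch of that, not the Mathematica code used for the results in this chapter. The names `COUNTS`, `SCALP`, `log_lik_defense`, and `log_lik_prosecution` are introduced here, and the functions assume every $p_j > 0$ and $\alpha > 0$.

```python
import math

COUNTS = (10, 20, 40, 19)             # newspaper hairs in bin_1..bin_4 (Table 6.1)
SCALP = (89 / 90, 1 / 90, 0.0, 0.0)   # scalp distribution, Simplifying Assumption 6

def log_lik_defense(p):
    """log P(Widths = b | not guilty, p) for p = (p1, p2, p3), all p_j > 0."""
    p4 = 1 - sum(p)
    return sum(n * math.log(q) for n, q in zip(COUNTS, (*p, p4)))

def log_lik_prosecution(p, alpha):
    """log P(Widths = b | guilty, p, alpha): beard/scalp mixture with weight alpha > 0."""
    p4 = 1 - sum(p)
    mix = [alpha * q + (1 - alpha) * s for q, s in zip((*p, p4), SCALP)]
    return sum(n * math.log(q) for n, q in zip(COUNTS, mix))

p_example = (0.1, 0.2, 0.45)          # hypothetical beard parameters, p4 = 0.25
print(log_lik_defense(p_example))
print(log_lik_prosecution(p_example, 0.5))
```

When $\alpha = 1$ the mixture is all beard hair and the two expressions coincide, which gives a quick consistency check on any implementation.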


From the next fact and Claim 8 we can compute the two terms of the likelihood ratio for
fixed $\alpha_{min}$ and $\alpha_{max}$.

Claim 11.

\[ \begin{aligned} P(\mathrm{Widths} = b \mid G) &= \int_{\alpha \in [\alpha_{min}, \alpha_{max}]} \int_{p \in B} P(\mathrm{Widths} = b \mid G, p, \alpha)\, d_{\langle \mathrm{BParams}, \mathrm{Mix} \rangle}(p, \alpha \mid G) \\ &= \frac{1}{(\alpha_{max} - \alpha_{min}) \|B\|} \sum_{i \in \{1,2,3,4\}} \int_{\alpha \in [\alpha_{min}, \alpha_{max}]} \int_{p \in B_i} P(\mathrm{Widths} = b \mid G, p, \alpha) \end{aligned} \]

\[ \begin{aligned} P(\mathrm{Widths} = b \mid \overline{G}) &= \int_{p \in B} P(\mathrm{Widths} = b \mid \overline{G}, p)\, d_{\mathrm{BParams}}(p \mid \overline{G}) \\ &= \frac{1}{\|B\|} \sum_{i \in \{1,2,3,4\}} \int_{p \in B_i} P(\mathrm{Widths} = b \mid \overline{G}, p) \end{aligned} \]

Proof. The first equation follows just from $(p, \alpha) \mapsto P(\mathrm{Widths} = b \mid G, p, \alpha)$ being an
integrable function and $d_{\langle \mathrm{BParams}, \mathrm{Mix} \rangle}(p, \alpha \mid G)$ being the conditional density function for
(Mix, BParams) given G = true.

The second equation follows from Claim 7, Simplifying Assumptions 3 and 4, and the fact
that $(p, \alpha) \mapsto P(\mathrm{Widths} = b \mid G, p, \alpha)$ is bounded. The first and fourth of those facts suffice
to show that the integral over B is equal to the sum of the integrals over the sets $B_i$.
Justifications for the third and fourth equations are similar to those for the first and
second. □


As of now I've mostly used Mathematica's numeric integration, which doesn't provide
error bounds, to evaluate the integrals, but there are also software packages one can use
that provide error bounds.

The likelihood ratio (Definition 4) achieves its maximum of $\approx 1.27$ when $\alpha_{min}$ and $\alpha_{max}$
are practically equal (unsurprising, as that allows the prosecution model to choose the
best mixture parameter) and around .935; Figure 6.1 illustrates this, showing the likelihood
ratio as a function of $\alpha_{min}$ when $\alpha_{max} - \alpha_{min} = 10^{-6}$. To prove Claim 5 we need to look
at parameterizations of $\alpha_{min}, \alpha_{max}$ similar to the one depicted in Figure 6.2, which shows
the likelihood ratio as a function of $\alpha_{max}$ when $\alpha_{min} = .849$ (the extreme point in the
hypothesis of Claim 5), in which case the likelihood ratio is maximized at $\approx .996$ when
$\alpha_{max} = 1$. In general, for smaller fixed $\alpha_{min}$, the quantity

\[ \max_{\alpha_{max} \in (\alpha_{min}, 1]} \text{likelihood-ratio}(\alpha_{min}, \alpha_{max}) \]

decreases as $\alpha_{min}$ does. More precisely, Claim 5 follows from the following three propositions
in Claim 12. The first has been tested using Mathematica's numerical integration;
if it is false, it is unlikely to be false by a wide margin (i.e. taking a value slightly smaller
than .849 should suffice). The remaining two have also not been proved, but one can gain
good confidence in them by testing plots similar to Figure 6.2 for values of $\alpha_{min} < .849$.
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Proving Claim 12 or a slightly weaker version of it is just a matter of spending more time 
on it (or enlisting the help of an expert to do it quickly). But we will see in the next 
section that the argument is more-vulnerable to attack in other ways. 

Claim 12. Here, the notation likelihood-ratio($\alpha_1$, $\alpha_2$) is short for “the real number taken on by the defined term likelihood-ratio when $\alpha_{\min} = \alpha_1$ and $\alpha_{\max} = \alpha_2$.”

1. likelihood-ratio(.849, 1) < .997

2. For $\alpha_1 < .849$ have likelihood-ratio($\alpha_1$, 1) < likelihood-ratio(.849, 1)

3. For $\alpha_1 < .849$ and $\alpha_1 < \alpha_2 < 1$ have likelihood-ratio($\alpha_1$, $\alpha_2$) < likelihood-ratio($\alpha_1$, 1)


Figure 6.1: Likelihood ratio as a function of $\alpha_{\min}$ when $\alpha_{\max} - \alpha_{\min} = 10^{-6}$, obtained by numerical integration.



6.3 Criticism of argument 

6.3.1 Criticism 1 

It is arguable that the prior for the suspect’s beard hair width distribution is slightly biased in favor of the defense, in which case the prosecution could reject Simplifying Assumption 4. In particular, the average value of the component of BParams for $\text{bin}_1$, the bin corresponding to the thinnest hairs, is 0.0625.⁸ It is best for the defense when the value of that component is 11/89, and best for the prosecution when it is 0, so the prosecution could reasonably insist that a prior is not fair unless the average is at most the mean of those two extremes, which is $\approx 0.0618$.

Figure 6.2: Likelihood ratio as a function of $\alpha_{\max}$ when $\alpha_{\min} = .849$, obtained by numerical integration. The shape of this plot is similar for smaller values of $\alpha_{\min}$, being maximized when $\alpha_{\max} = 1$, which is what parts 2 and 3 of Claim 12 express.

We can raise this criticism in a disciplined way, for example by suggesting an axiom that expresses the above: if $x$ is the value of $p_1$ that maximizes the probability of the evidence given G = true, and $y$ is the value of $p_1$ that maximizes the probability of the evidence given G = false, then the average of $p_1$ over $p \in B$ is at most $(x + y)/2$.

The defense can respond to the criticism, and I will explain how in Section 6.3.3. 
Doing so requires slightly strengthening the hypotheses of Claims 5 and 6. 

6.3.2 Criticism 2 

The second criticism says that the prior for BParams is unreasonable, with respect to measurements of beard hair widths of black men in the literature, in that it never yields a beard hair width distribution that has hairs of width greater than 187.5 micrometers. In terms of the argument, the critic should reject the (implicit) axioms that constitute the types of the symbols width and Widths; according to the semantics of those symbols, their types assert that all the hairs in Leighton Hay’s beard and scalp had thickness at most 187.5 micrometers, which is unjustified. Formally, according to Section 2.2.1, one way to do this would be for the critic to suggest new definitions of Bins, width, and Widths. The critic can do this by suggesting new axioms (some of which are type constraints). Most importantly, the critic should suggest redefining the sort Bins as $\{\text{bin}_1, \ldots, \text{bin}_5\}$, where $\text{bin}_5$ is a new constant. The results of that approach are discussed in the next section.

8 Compute by substituting $p_1$ in for $t$ in each of the four equations of Claim 8, and sum the results.

6.3.3 Response to criticisms 

We can address both criticisms at once; if we introduce a fifth component of BParams corresponding to the interval $(187.5, \infty)$, and like the first component (probability width is in $\text{bin}_1$) of BParams constrain it to be less than the middle three components (for $\text{bin}_2$, $\text{bin}_3$, $\text{bin}_4$), then the average value of the $\text{bin}_1$ component of BParams goes down to < .057. We then need to slightly strengthen the hypotheses of the two main claims, changing the parameter .85 in Claim 5 to .835 and the parameter 15 in Claim 6 to 13.9. Then, the proof works as before.

6.3.4 An open problem 

Though I do not have such a criticism in mind, the prosecution could potentially argue 
that the prior for Hay’s beard hair distribution is still biased, in the sense that it does 
not take into account everything we know about the beard hair width distributions of 
young black men or Hay himself, say by referring to literature such as [TCFK83] (cited in 
the documents submitted by expert witnesses from both sides of the trial), or by taking 
samples of Hay’s current beard hair width distribution and somehow adjusting for the 
increase in width that expert witnesses said is likely, since Hay was only 19 at the time of 
the murder. Or they could criticize my choice of prior by claiming that it assumes too much.⁹

Given that, an ideal proof would have the following form. We would first come up with some relation $R$ over priors for 5-bin distributions, such that $R(f)$ expresses as well as possible (given the constraint of having to complete the proof of the following proposition) that $f$ is “fair and reasonable”. Then, we would find the largest constant $\alpha_0 \in (0,1)$ such that we can prove:

For any $f \in R$, if $f$ is used as the prior for the suspect’s beard hair width distribution, and $\alpha_{\min} < \alpha_0$, then likelihood-ratio < 1

9 Although I expect that would be a bad idea. For example, I found that if we take the prior to be the completely uniform prior over finite distributions for 5 bins, then the results are significantly worse for the prosecution.

The same goes for Hay’s scalp hair width distribution; it would be better to have a 
broader set of distributions that an adversary can choose from. At the very least, the 
argument should accommodate the possibility that Hay’s scalp hairs have thinned over 
time, in which case we would make use of the fact that Hay is not balding (male pattern 
balding makes hair follicles, and the hairs they produce, gradually thinner, until the hair 
follicle is blocked completely). 



Chapter 7 

Example: Arguing that smoking causes 
cancer in 1950 


In 1950 two landmark papers were published giving some of the first strong statistical 
evidence in the English-speaking world [SE05]¹ that tobacco smoking causes cancer, the
first in the United States [WG85] and the second in England [DH50]. Yet it was not until
1965 that cigarette packages were required to have health warnings in the United States. 
Michael J. Thun, in his article When truth is unwelcome: the first reports on smoking and 
lung cancer, argued that 15 years was much too long given the strength of those studies: 

In retrospect, the strength of the association in the two largest and most
influential of these studies - by Ernest Wynder & Evarts Graham in the
Journal of the American Medical Association (JAMA) ... and by Richard Doll
& Austin Bradford Hill (both of whom were later knighted for their work)
in the British Medical Journal - should have been sufficient to evoke a much
stronger and more immediate response than the one that actually occurred.

Had the methods for calculating and interpreting odds ratios been available
at the time, the British study would have reported a relative risk of 14 in
cigarette smokers compared with never-smokers, and the American study a
relative risk of nearly 7,² too high to be dismissed as bias. [Thu05]

I will give part of an argument here that the health warnings policy was well-justified already in the early 1950s. The full argument involves introducing two more candidate models (see below), the cigarette companies’ unknown genotype model and the statistician R.A. Fisher’s soothing herb model. How to refute those models is discussed in the next section. The part of the argument given here simply compares a weak version dependModel of the standard, causal model, to the naive null-hypothesis model indepModel, which posits that smoking and lung cancer are independent. I call dependModel the “dependent-variables” model, since it doesn’t actually formalize why it predicts that smoking and cancer are dependent variables.

1 In Smith and Egger’s short letter to the editors of the Bulletin of the World Health Organization [SE05], they give a very interesting account of how the history of this scientific progress is poorly known. In fact there were already reviews of the literature on the connection between smoking and lung cancer as early as 1929! Even the theory of second-hand smoking is at least as old as 1928.

2 Note that these relative risk calculations treat the two studies separately, whereas both versions of the argument in this chapter use the earlier study to fit a model for the later study.

This argument is an instance of the following setup: An experiment to measure 
some variable is designed and published, with the possible outcomes of the experiment 
(values of the variable) defined precisely. Sufficient time is given for all the interested 
parties to publish competing models for predicting the outcome of the experiment, by 
giving probability distributions over the set of possible outcomes. The experiment is 
performed. Suppose that one of the models M is "overwhelmingly better" (defined in the
experimental design; below, via the definition of Beats(·, ·) and Axiom 28) at predicting
the true outcome (or an outcome near the true one) than the others. Moreover, suppose 
that M predicts that the use of a certain product may pose a health risk to its users; below, 
this is productWarning(M). Then the result of this competition must be communicated 
to potential users of the product. The warning can be revoked if M loses in a later equally 
rigorous experiment competition. 

The purpose of this example is, in part, to demonstrate that the requirement of 
deductive reasoning is not a limitation for problems in the domain I specified (Section
2.1),³ provided at least that one is firmly committed to certain ideals of persuasion.

Two versions of the argument are given, both dependent on mathematical claims that 
are unproved, but easily testable, and very likely easily resolvable by an appropriate expert 
(see footnote 1 on page 1 about unproved purely-mathematical claims). The version in 
Section 7.3 is simpler and more complete than the version in Section 7.2, but also weaker 
in that it uses a more idealized, less accurate model of the experiment. 

7.1 Extensions and refinements of the argument 

In the argument below, the causal scientific model, which motivates the assumptions made by dependModel, is not made explicit. With the addition of Fisher’s soothing herb model and the tobacco companies’ unknown genotype model (i.e. adding those models to the set AllCM), it would be necessary to make candidate models derive their experimental outcome distributions from more-qualitative assumptions. The reason is that those models are contrived to fit the data; they have outcome distributions similar to dependModel’s, seemingly (but not provably!) just to prevent dependModel from winning on purely quantitative grounds, as it does against indepModel. Hence it is necessary to have a test that at least requires that a model’s outcome distribution is derived from some more-readily-understandable axioms. In Fisher’s model, the readily-understandable axioms essentially say that lung cancer causes smoking. In the unknown genotype model, they say that there is a common genetic cause of both lung cancer and a person’s propensity to smoke tobacco. The easiest way to refute those models is to incorporate the data on female smoking and cancer, which neither model is able to explain without being made more elaborate.⁴ In fact, the argument could be strengthened in either or both of two ways: make models derive their experimental outcome distribution from qualitative assumptions, or make their experimental outcome distributions explain more data. I would advocate both.

3 This example does not today meet the second criterion (contentiousness) that I listed there, but it did in the 1950s.

Another, more technical and subtle way of improving the argument is to elaborate the
definition of Beats($M_1$, $M_2$) in such a way that, in effect, instantiations of the parameters
of the two models $M_1$ and $M_2$ are only compared if they agree on the number of smokers in
the British population. This would prevent one model from having an advantage over the
other simply by having a better estimate of the total number of smokers (which, intuitively, 
we don’t care about). Unfortunately, it would also very likely make the suitably-modified 
versions of Conjectures 1 and 2 harder to prove. 

7.2 Proof with hypergeometric distributions contingent 
on an unproved mathematical claim 

Vaguely-defined sorts (in $\mathcal{L}_{\text{vague}}$)

• CM : candidate models for the possible outcomes of the British study [DH50]. In the current version of this argument, a candidate model M is determined by outcomeDistr(M, ·) and productWarning(M).

• $\mathcal{A}$ : set of adult men living in the US at the time when the American study [WG85] was done.

• $\mathcal{B}$ : set of adult men living in England at the time when the British study was done.

4 Smoking became popular among men years before it became popular among women, and the lung cancer rates reflect this. The unknown genotype model could explain the earlier, smaller rates of lung cancer and smoking among women by suggesting a sex-linked genotype; however, it would not be able to explain why the rates increased so quickly. As for Fisher’s soothing herb model (lung cancer causes smoking, because of the soothing effect of smoking), it would require an additional hypothesis, unrelated to the purported soothing effect, to explain why there was a delay in the increase of female lung cancer rates.

Sharply-definable sorts (in $\mathcal{L}_{\text{rigid}}$)

• $\mathbb{R}$ and $\mathbb{N}$ - reals and natural numbers.

• $\mathbb{R}^{\bot}$ and $\mathbb{N}^{\bot}$ - reals and naturals, but each with an extra element for “undefined”, to serve as the range of division and subtraction. Intuitively structures should make these be supersets of $\mathbb{R}$ and $\mathbb{N}$, but technically all the sorts are disjoint. For readability I will not display the unary function symbols that are sometimes necessary to convert between the two sorts.

• FS[$\alpha$] - finite subsets of (the interpretation of) the given sort $\alpha$. This is a sort operator, i.e. a function from sorts to sorts.

• Fn[$\alpha$, $\beta$] - the functions from $\alpha$ to $\beta$, another sort operator.

• Str - strings over the ASCII alphabet

• StudyOutcomes (informally a “subsort” or “subtype” of FS[$\mathbb{N}$]) - the set {620, ..., 649}.

Before the study is done, we don’t know how many of the people with lung cancer are smokers, i.e. $|\mathrm{LC}_B^{\mathrm{samp}} \cap S_B^{\mathrm{samp}}|$ is unknown. The size of that set is smallest when every person without lung cancer is a smoker, and largest when every person with lung cancer is a smoker, so the set of outcomes of the study (the possible sizes of $\mathrm{LC}_B^{\mathrm{samp}} \cap S_B^{\mathrm{samp}}$) is $\{|S_B^{\mathrm{samp}}| - |\overline{\mathrm{LC}}_B^{\mathrm{samp}}|, \ldots, |\mathrm{LC}_B^{\mathrm{samp}}|\} = \{620, \ldots, 649\}$.
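As a quick check of the range above, the two bounds can be recomputed from the British sample sizes listed later in Quasi-Definition 5 (649 patients with lung cancer, 649 without, 1269 smokers). This is only an arithmetic sketch; the variable names are illustrative and not part of the formal language.

```python
# Sample sizes from the British study, as given in Quasi-Definition 5.
lc_size = 649        # patients with lung cancer
non_lc_size = 649    # patients with conditions other than lung cancer
smokers = 1269       # smokers in the whole sample

# Fewest smokers with lung cancer: every patient without lung cancer smokes.
low = max(0, smokers - non_lc_size)
# Most smokers with lung cancer: every patient with lung cancer smokes.
high = min(lc_size, smokers)

study_outcomes = range(low, high + 1)
print(low, high, len(study_outcomes))  # prints: 620 649 30
```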

Function symbols in $\mathcal{L}_{\text{vague}}$

In the following, a person being a “smoker” means that they smoked at least one cigarette per day during the most-recent period when they smoked.

• $B^{\mathrm{pop}}$ : FS[$\mathcal{B}$] is a hypothetical set; the population that we imagine the British study samples were drawn from.

• $\mathrm{LC}_B^{\mathrm{pop}}$ : FS[$\mathcal{B}$] is the set of people in $B^{\mathrm{pop}}$ with lung cancer.

• $\overline{\mathrm{LC}}_B^{\mathrm{pop}}$ : FS[$\mathcal{B}$] is the set of people in $B^{\mathrm{pop}}$ without lung cancer.

• $S_i^{\mathrm{pop}}$ : FS[$\mathcal{B}$] is indepModel’s guess at the set of smokers in $B^{\mathrm{pop}}$.

• $S_d^{\mathrm{pop}}$ : FS[$\mathcal{B}$] is dependModel’s guess at the set of smokers in $B^{\mathrm{pop}}$.

• $A^{\mathrm{samp}}$ : FS[$\mathcal{A}$], $B^{\mathrm{samp}}$ : FS[$\mathcal{B}$] is the sample of patients used in the American (resp. British) study.

• $\mathrm{LC}_A^{\mathrm{samp}}$ : FS[$\mathcal{A}$], $\mathrm{LC}_B^{\mathrm{samp}}$ : FS[$\mathcal{B}$] is the set of people in $A^{\mathrm{samp}}$ (resp. $B^{\mathrm{samp}}$) who have lung cancer.

• $S_A^{\mathrm{samp}}$ : FS[$\mathcal{A}$], $S_B^{\mathrm{samp}}$ : FS[$\mathcal{B}$] is the set of smokers in $A^{\mathrm{samp}}$ (resp. $B^{\mathrm{samp}}$).

• outcomeDistr(·, ·) : CM × StudyOutcomes → $\mathbb{R}$ is the given candidate model’s distribution over StudyOutcomes.

• AllCM : FS[CM] - the set of all candidate models. It should contain a candidate model from every interested party.

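Defined function symbols (in $\mathcal{L}_{\text{def}}$)

• StudyOutcomes : FS[$\mathbb{N}$] := {620, ..., 649}. A copy of the sort StudyOutcomes (see above for definition) that resides in the universe. So StudyOutcomes denotes both (1) a sort, and (2) an element of the universe defined to be the set that is the intended interpretation of (1).

• Constants for the complements of some sets:

• For each symbol $X \in \{\mathrm{LC}_A^{\mathrm{samp}}, S_A^{\mathrm{samp}}\}$: $\overline{X} := A^{\mathrm{samp}} \setminus X$

• For each symbol $X \in \{\mathrm{LC}_B^{\mathrm{samp}}, S_B^{\mathrm{samp}}\}$: $\overline{X} := B^{\mathrm{samp}} \setminus X$

• For each symbol $X \in \{\mathrm{LC}_B^{\mathrm{pop}}, S_i^{\mathrm{pop}}, S_d^{\mathrm{pop}}\}$: $\overline{X} := B^{\mathrm{pop}} \setminus X$

• $\Pr_{x \in U}(x \in V_1 \mid x \in V_2)$ : FS[$\alpha$] × FS[$\alpha$] × FS[$\alpha$] → $\mathbb{R}^{\bot}$ := $|V_1 \cap V_2 \cap U| \,/\, |V_2 \cap U|$

• For each $k \in \{0, 1, 2\}$:

$\text{testInterval}_k$ : FS[StudyOutcomes] := $\{|S_B^{\mathrm{samp}} \cap \mathrm{LC}_B^{\mathrm{samp}}| - k, \ldots, |S_B^{\mathrm{samp}} \cap \mathrm{LC}_B^{\mathrm{samp}}| + k\}$

• For each $k \in \{0, 1, 2\}$:

$\text{test}_k(M)$ : CM → $\mathbb{R}$ := $\sum_{x = \min(\text{testInterval}_k)}^{\max(\text{testInterval}_k)} \text{outcomeDistr}(M, x)$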

Predicate symbols in $\mathcal{L}_{\text{def}}$

• Beats($M_1$:CM, $M_2$:CM) ↔ $\bigwedge_{k \in \{0,1,2\}} \text{test}_k(M_1) > 1000 \cdot \text{test}_k(M_2)$. Model $M_1$ beats model $M_2$ if it assigns much higher probability to the true outcome $|S_B^{\mathrm{samp}} \cap \mathrm{LC}_B^{\mathrm{samp}}|$, as well as to the intervals of size 3 and 5 around the true outcome. The interval of size 5 is about 17% of StudyOutcomes, and any larger interval would be biased since the interval of size 5 already contains the maximum of StudyOutcomes.

• BeatsAll($M_1$:CM) ↔ $\forall M_2$:CM. ($M_2 \in$ AllCM $\wedge$ $M_1 \neq M_2$) ⇒ Beats($M_1$, $M_2$) simply says that $M_1$ beats all the other models in AllCM.

Function symbols in $\mathcal{L}_{\text{rigid}}$

• {x, ..., y} : $\mathbb{N} \times \mathbb{N} \to$ FS[$\mathbb{N}$] is the set of naturals from x to y inclusive, or the empty set if x > y.

• +, · : $\mathbb{N} \times \mathbb{N} \to \mathbb{N}$ are addition and multiplication for $\mathbb{N}$.

• +, · : $\mathbb{R} \times \mathbb{R} \to \mathbb{R}$ are addition and multiplication for $\mathbb{R}$. These symbols are distinct from the ones for $\mathbb{N}$, but display in the same way.

• − : $\mathbb{N} \times \mathbb{N} \to \mathbb{N}^{\bot}$ is subtraction, but undefined if the result is negative.

• / : $\mathbb{R} \times \mathbb{R} \to \mathbb{R}^{\bot}$ is division, undefined when the second argument is 0.

• apply : Fn[$\mathbb{N}$, $\mathbb{R}$] × $\mathbb{N}$ → $\mathbb{R}$ is the application of a function object to an argument. There are versions of this symbol for a few other types as well.

• $\sum_{x=t_1}^{t_2} t_3(x)$ : $\mathbb{N} \times \mathbb{N} \times$ Fn[$\mathbb{N}$, $\mathbb{R}$] → $\mathbb{R}$ is the usual summation binder symbol. The formal syntax is $\Sigma(t_1, t_2, \lambda x{:}\mathbb{N}.\, t_3)$, where “$\lambda x{:}\mathbb{N}.\, t_3$” is actually just the name of a constant symbol of sort Fn[$\mathbb{N}$, $\mathbb{R}$] that is implicitly defined in terms of the open term $t_3$ and the function symbol apply, but I’ve hidden those definitions from this writeup for readability.

• $\cap$ : FS[$\alpha$] × FS[$\alpha$] → FS[$\alpha$] is set intersection. As with the other function symbols in this list whose type is presented with a sort variable $\alpha$, there are multiple distinct function symbols, for various instantiations of $\alpha$, that each display as $\cap$. In the HTML presentation of interpreted formal proofs, one can disambiguate the symbol by hovering over it to see its type.

• $\setminus$ : FS[$\alpha$] × FS[$\alpha$] → FS[$\alpha$] is set difference.

• | · | : FS[$\alpha$] → $\mathbb{N}$ is the size of the given finite subset of (the interpretation of) $\alpha$.

• $\binom{X}{k}$ : FS[$\alpha$] × $\mathbb{N}$ → FS[FS[$\alpha$]] is the set of subsets of X of size k.

• min(·), max(·) : FS[$\mathbb{N}$] → $\mathbb{N}^{\bot}$ are the minimum and maximum elements of a finite set of naturals. Undefined if the set is empty.

• hyper($k$, $s$, $N$, $s'$) : $\mathbb{N} \times \mathbb{N} \times \mathbb{N} \times \mathbb{N} \to \mathbb{R}^{\bot}$ is the hypergeometric distribution (in the last argument; the other three arguments are parameters), defined when $s' \le s \le N$, $s' \le k \le N$; if a population of size $N$ has $s$ smokers and $N - s$ nonsmokers, and $k$ people are chosen uniformly at random without replacement from the population, then hyper($k$, $s$, $N$, $s'$) is the probability that the resulting set contains exactly $s'$ smokers.

• condHyper($s_1$, $s_2$, $X_1$, $X_2$, $s_1'$) : $\mathbb{N} \times \mathbb{N} \times$ FS[$\mathcal{B}$] × FS[$\mathcal{B}$] × $\mathbb{N}$ → $\mathbb{R}^{\bot}$ is a probability distribution (in the last argument; the other four arguments are parameters), defined when $s_1' \le |S_B^{\mathrm{samp}}|$, $s_1 \le |X_1|$, $s_2 \le |X_2|$. Suppose we have disjoint sets of people $X_1$ and $X_2$, with $X_1$ having $s_1$ smokers and $X_2$ having $s_2$ smokers. Uniformly at random we choose size-$|\mathrm{LC}_B^{\mathrm{samp}}|$ subsets $X_1'$ of $X_1$ and $X_2'$ of $X_2$. Then condHyper($s_1$, $s_2$, $X_1$, $X_2$, $s_1'$) is the conditional probability that $X_1'$ contains exactly $s_1'$ smokers, given that there are $|S_B^{\mathrm{samp}}|$ smokers in $X_1' \cup X_2'$.
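The prose definitions of hyper and condHyper translate directly into code. The sketch below makes several simplifying assumptions that are not in the formal language: the sets X_1 and X_2 are represented only by their sizes `n1` and `n2`, the sample size and the observed number of smokers are passed as explicit arguments `draw` and `total`, values outside the support are returned as 0 rather than treated as undefined, and the small numbers in the usage lines are made up.

```python
from math import comb


def hyper(k: int, s: int, N: int, s_prime: int) -> float:
    """P(exactly s_prime smokers) when k people are drawn without replacement
    from a population of size N that contains s smokers."""
    if not (0 <= s <= N and 0 <= k <= N):
        raise ValueError("need 0 <= s <= N and 0 <= k <= N")
    if s_prime < 0 or s_prime > s or k - s_prime < 0 or k - s_prime > N - s:
        return 0.0  # outside the support (assumption: the text leaves this case undefined)
    return comb(s, s_prime) * comb(N - s, k - s_prime) / comb(N, k)


def cond_hyper(s1: int, s2: int, n1: int, n2: int,
               draw: int, total: int, s1_prime: int) -> float:
    """P(draw from group 1 has s1_prime smokers | both draws together have `total` smokers).

    Group i has n_i people of whom s_i smoke; `draw` people are taken from each group.
    """
    def joint(x: int) -> float:
        return hyper(draw, s1, n1, x) * hyper(draw, s2, n2, total - x)

    norm = sum(joint(x) for x in range(0, draw + 1))
    return joint(s1_prime) / norm


# Toy usage: two groups of 10 with 8 and 4 smokers, 5 drawn from each, 6 smokers seen.
dist = [cond_hyper(8, 4, 10, 10, 5, 6, x) for x in range(6)]
print(sum(dist))  # sums to 1 up to rounding
```

The normalizing sum here runs over every possible count from 0 to `draw`; restricting it to StudyOutcomes gives the same value because all other terms are zero.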





Simplifying Assumption 8. We would change this to a normal Assumption if we 
included formalizations of Fisher’s and the tobacco companies’ models also (see section 
7.1 above). 

AllCM = {indepModel, dependModel}

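Quasi-Definition 3. Sizes of sets from the American study.

$|\overline{\mathrm{LC}}_A^{\mathrm{samp}}| = 780$    patients in sample with conditions other than cancer
$|\mathrm{LC}_A^{\mathrm{samp}}| = 605$    patients in sample with lung cancer
$|\overline{S}_A^{\mathrm{samp}} \cap \overline{\mathrm{LC}}_A^{\mathrm{samp}}| = 114$    nonsmokers with conditions other than cancer
$|\overline{S}_A^{\mathrm{samp}} \cap \mathrm{LC}_A^{\mathrm{samp}}| = 8$    nonsmokers with lung cancer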

Quasi-Definition 4. 

productWarning(dependModel) = “Scientific studies have found a correlation between
tobacco smoking and lung cancer that is currently best-explained by the hypothesis that
smoking causes an increase in the probability that any person will get lung cancer.”

productWarning(indepModel) = (the empty string) 

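Quasi-Definition 5. This gives the sizes of the sample sets, and certain subsets of those sets, from the British study. We evaluate the different models on how well they predict the size of $\mathrm{LC}_B^{\mathrm{samp}} \cap S_B^{\mathrm{samp}}$, given the sizes of $\mathrm{LC}_B^{\mathrm{samp}}$, $\overline{\mathrm{LC}}_B^{\mathrm{samp}}$, and $S_B^{\mathrm{samp}}$. A model predicts the size well if its distribution over StudyOutcomes assigns high probability to $|\mathrm{LC}_B^{\mathrm{samp}} \cap S_B^{\mathrm{samp}}|$ or some close number; this is formalized in the definition of Beats(·, ·).

$|\mathrm{LC}_B^{\mathrm{samp}}| = |\overline{\mathrm{LC}}_B^{\mathrm{samp}}| = 649$
$|S_B^{\mathrm{samp}}| = 1269$
$|\mathrm{LC}_B^{\mathrm{samp}} \cap S_B^{\mathrm{samp}}| = 647$
$|\overline{\mathrm{LC}}_B^{\mathrm{samp}} \cap S_B^{\mathrm{samp}}| = 622$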


Simplifying Assumption 9 (dependModel posits a hypergeometric distribution). Note that the values of the four parameters in condHyper($|\mathrm{LC}_B^{\mathrm{pop}} \cap S_d^{\mathrm{pop}}|$, $|\overline{\mathrm{LC}}_B^{\mathrm{pop}} \cap S_d^{\mathrm{pop}}|$, $\mathrm{LC}_B^{\mathrm{pop}}$, $\overline{\mathrm{LC}}_B^{\mathrm{pop}}$, ·) are only bounded by the other axioms, especially Assumptions (23), (24), (25), and (26), with the latter two distinguishing dependModel’s distribution from indepModel’s. Still, this and Simplifying Assumption 10 are the weakest of the axioms with respect to the standards of accuracy that I strive for. Unlike the others, we cannot seriously claim that this axiom is literally true with respect to the informal intended semantics given by the language interpretation guide, simply because the authors of the British study did not methodically randomize the way that they chose their sample sets of men with and without lung cancer. I would be satisfied to have an axiom that says outcomeDistr(dependModel, ·) is “close enough” to a hypergeometric distribution, but I have not yet investigated suitable ways of formalizing “close enough,” and it is not clear that there would be a benefit in pedagogy or cogency that warrants the added complexity.

$\forall s$:StudyOutcomes. outcomeDistr(dependModel, $s$)
= condHyper($|\mathrm{LC}_B^{\mathrm{pop}} \cap S_d^{\mathrm{pop}}|$, $|\overline{\mathrm{LC}}_B^{\mathrm{pop}} \cap S_d^{\mathrm{pop}}|$, $\mathrm{LC}_B^{\mathrm{pop}}$, $\overline{\mathrm{LC}}_B^{\mathrm{pop}}$, $s$)

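Simplifying Assumption 10 (indepModel posits a hypergeometric distribution). Note that the values of the four parameters in condHyper($|\mathrm{LC}_B^{\mathrm{pop}} \cap S_i^{\mathrm{pop}}|$, $|\overline{\mathrm{LC}}_B^{\mathrm{pop}} \cap S_i^{\mathrm{pop}}|$, $\mathrm{LC}_B^{\mathrm{pop}}$, $\overline{\mathrm{LC}}_B^{\mathrm{pop}}$, ·) are only constrained by the other axioms, especially Assumptions (23), (24), and (27), with Assumption (27) distinguishing indepModel’s distribution from dependModel’s.

$\forall s$:StudyOutcomes. outcomeDistr(indepModel, $s$)
= condHyper($|\mathrm{LC}_B^{\mathrm{pop}} \cap S_i^{\mathrm{pop}}|$, $|\overline{\mathrm{LC}}_B^{\mathrm{pop}} \cap S_i^{\mathrm{pop}}|$, $\mathrm{LC}_B^{\mathrm{pop}}$, $\overline{\mathrm{LC}}_B^{\mathrm{pop}}$, $s$)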

Assumption 23. This is a conservative axiom for dependModel; a figure from the British study says that the rate of lung cancer in men was 10.6 per 100,000 in 1936-1939, and population data for England in 1951 puts the population at about 38.7 million, hence even if the population from which the British sample was drawn is taken to be the entire nation, if we assume about half the population was male, and that the rate at most tripled from 1939 to 1950, then we should expect at most about 6150 men with lung cancer.

$|\mathrm{LC}_B^{\mathrm{pop}}| < 7000$
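The estimate behind this bound is a one-line product, reproduced here as a sanity check. The inputs are the figures quoted in the paragraph above; the variable names are illustrative.

```python
population = 38.7e6              # England, 1951 (approximate figure quoted above)
male_fraction = 0.5              # "about half the population was male"
rate_1936_1939 = 10.6 / 100_000  # lung cancer rate in men, 1936-1939
growth_factor = 3                # "at most tripled from 1939 to 1950"

upper_estimate = population * male_fraction * rate_1936_1939 * growth_factor
print(round(upper_estimate))     # prints: 6153
print(upper_estimate < 7000)     # prints: True
```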

Assumption 24. This is a conservative axiom for dependModel; it says that of the 
hospital patients from which the British scientists drew their sample, at most 1 in 6 had 
lung cancer (in reality it would have been significantly lower). 

$|\overline{\mathrm{LC}}_B^{\mathrm{pop}}| \ge 5 \cdot |\mathrm{LC}_B^{\mathrm{pop}}|$

Assumption 25. Consider the ratio of probabilities of being a British nonsmoker given that you have lung cancer vs. given that you have a hospitalizable illness other than lung cancer. This axiom says that it is not much larger than the corresponding ratio seen in the American study sample, specifically not more than 3 times larger. If we were to define a best-guess version of the dependent-variables model dependModel, we would set the (unknown) left side of the below inequality equal to the (known) right side (and similarly for Assumption (26)), in effect positing that the correlation between smoking and lung cancer in the British population is identical to the correlation in the American sample. However, the evidence is so strongly in favor of dependModel that this much weaker assumption suffices:

\[
\frac{\Pr_{x \in B^{\mathrm{pop}}}(x \notin S_d^{\mathrm{pop}} \mid x \in \mathrm{LC}_B^{\mathrm{pop}})}{\Pr_{x \in B^{\mathrm{pop}}}(x \notin S_d^{\mathrm{pop}} \mid x \in \overline{\mathrm{LC}}_B^{\mathrm{pop}})}
\;\le\; 3 \cdot \frac{\Pr_{x \in A^{\mathrm{samp}}}(x \notin S_A^{\mathrm{samp}} \mid x \in \mathrm{LC}_A^{\mathrm{samp}})}{\Pr_{x \in A^{\mathrm{samp}}}(x \notin S_A^{\mathrm{samp}} \mid x \in \overline{\mathrm{LC}}_A^{\mathrm{samp}})} \quad {}^{5}
\]

Assumption 26. Same comment as in Assumption 25 applies here.

\[
\Pr_{x \in B^{\mathrm{pop}}}(x \notin S_d^{\mathrm{pop}} \mid x \in \overline{\mathrm{LC}}_B^{\mathrm{pop}})
\;\ge\; \tfrac{1}{3} \cdot \Pr_{x \in A^{\mathrm{samp}}}(x \notin S_A^{\mathrm{samp}} \mid x \in \overline{\mathrm{LC}}_A^{\mathrm{samp}}) \quad {}^{6}
\]
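The numerical right-hand sides of Assumptions 25 and 26 follow from the American counts in Quasi-Definition 3. The short check below recomputes them; the variable names are illustrative.

```python
# Counts from the American study (Quasi-Definition 3).
lc, non_lc = 605, 780                      # patients with / without lung cancer
nonsmokers_lc, nonsmokers_non_lc = 8, 114  # nonsmokers in each group

p_nonsmoker_given_lc = nonsmokers_lc / lc
p_nonsmoker_given_non_lc = nonsmokers_non_lc / non_lc

bound_25 = 3 * p_nonsmoker_given_lc / p_nonsmoker_given_non_lc  # right side of Assumption 25
bound_26 = p_nonsmoker_given_non_lc / 3                         # right side of Assumption 26

print(f"{p_nonsmoker_given_lc:.7f} {p_nonsmoker_given_non_lc:.6f}")  # 0.0132231 0.146154
print(f"{bound_25:.5f} {bound_26:.6f}")                              # 0.27142 0.048718
```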

Assumption 27. The independent-variables model simply posits that, in the population 
from which the British sample was drawn, the fraction of smokers among people with 
lung cancer is the same as the fraction of smokers among people with illnesses other than 
lung cancer. 


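\[
\Pr_{x \in B^{\mathrm{pop}}}(x \in S_i^{\mathrm{pop}} \mid x \in \mathrm{LC}_B^{\mathrm{pop}}) = \Pr_{x \in B^{\mathrm{pop}}}(x \in S_i^{\mathrm{pop}} \mid x \in \overline{\mathrm{LC}}_B^{\mathrm{pop}})
\]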


The next assumption states the intended consequence of one model beating all the others. 
Assumption 28. VM:CM. BeatsAll(M) => ShouldRequire(productWarning(M)) 

Claim 13.

\[
\text{condHyper}(s_1, s_2, X_1, X_2, s_1') =
\frac{\text{hyper}(|\mathrm{LC}_B^{\mathrm{samp}}|, s_1, |X_1|, s_1') \cdot \text{hyper}(|\overline{\mathrm{LC}}_B^{\mathrm{samp}}|, s_2, |X_2|, |S_B^{\mathrm{samp}}| - s_1')}
{\sum_{x = \min(\text{StudyOutcomes})}^{\max(\text{StudyOutcomes})} \text{hyper}(|\mathrm{LC}_B^{\mathrm{samp}}|, s_1, |X_1|, x) \cdot \text{hyper}(|\overline{\mathrm{LC}}_B^{\mathrm{samp}}|, s_2, |X_2|, |S_B^{\mathrm{samp}}| - x)}
\]

Proof. This is a standard definition of the conditional hypergeometric distribution. An informal proof is easy from the informal semantics given for condHyper(·, ·, ·, ·, ·). Note that we could alternatively have made condHyper(·, ·, ·, ·, ·) a defined function symbol. □

5 = 3(.0132231/.146154) ≈ .27142

6 = (1/3).146154 = .048718

The above axioms, together with some basic mathematical axioms, prove that for any setting of the free parameters $|\mathrm{LC}_B^{\mathrm{pop}}|$, $|\overline{\mathrm{LC}}_B^{\mathrm{pop}}|$, $|S_d^{\mathrm{pop}} \cap \mathrm{LC}_B^{\mathrm{pop}}|$, $|S_d^{\mathrm{pop}} \cap \overline{\mathrm{LC}}_B^{\mathrm{pop}}|$, etc. that obeys the constraints given by Axioms (23)-(27), the dependent-variables model decisively beats the independent-variables model:

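Conjecture 1.

\[ \bigwedge_{k \in \{0,1,2\}} \text{test}_k(\text{dependModel}) > 5000 \cdot \text{test}_k(\text{indepModel}) \]

From Conjecture 1, together with Simplifying Assumption 8 and Assumption 28, the goal sentence follows:

ShouldRequire(productWarning(dependModel))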


7.3 Simpler, more-easily completable proof 

We add a symbol for the binomial distribution.

$\text{binDistr}_{p,n}(\cdot)$ : $[0,1] \times \mathbb{N} \times \mathbb{N} \to \mathbb{R}^{\bot}$

We may either give $\text{binDistr}_{p,n}(\cdot)$ a prose definition, and then state the next sentence as a Claim, or we could make $\text{binDistr}_{p,n}(\cdot)$ a defined function symbol defined by the next sentence. Either way is consistent with the definition of interpreted formal proof. In the proof from the previous section, the former option was taken. In this section I will leave it ambiguous.

\[ \forall p{:}[0,1].\ \forall n, k{:}\mathbb{N}.\ \ \text{binDistr}_{p,n}(k) = \binom{n}{k} p^k (1 - p)^{n-k} \]

We introduce a family of probability distributions that takes the place of condHyper(·, ·, ·, ·, ·). We give it the following prose definition, and the later two axioms, Claims (14) and (15), are made only for the purpose of calculation.

• condBinom($p_1$, $p_2$, $s_1'$) : $[0,1] \times [0,1] \times \mathbb{N} \to \mathbb{R}$ is a probability distribution (in the last argument; the other two arguments are parameters). Suppose we sample (with replacement) $|\mathrm{LC}_B^{\mathrm{samp}}|$ times from each of two binomial distributions, the first having success probability $p_1$ and the second having success probability $p_2$. Then condBinom($p_1$, $p_2$, $s_1'$) is the conditional probability that we get $s_1'$ successes from the first distribution given that the sum of successes is $|S_B^{\mathrm{samp}}|$.⁷

We also introduce three new constants $p_{S_d|\mathrm{LC}}$, $p_{S_d|\overline{\mathrm{LC}}}$, and $p_{S_i|*}$ of type [0,1]. $p_{S_d|\mathrm{LC}}$ and $p_{S_d|\overline{\mathrm{LC}}}$ are dependModel’s estimates of the fraction of smokers in the lung cancer population and in the population of people with conditions other than lung cancer. $p_{S_i|*}$ is indepModel’s estimate of the fraction of smokers in both populations.

7 Note that this family of distributions is usually given with $|\mathrm{LC}_B^{\mathrm{samp}}|$ and $|S_B^{\mathrm{samp}}|$ as parameters.

We drop Simplifying Assumptions (9) and (10), replacing them with the following two: 

Simplifying Assumption 11 (dependModel posits a binomial distribution). Note that the values of the two parameters (first two arguments) of condBinom(·, ·, ·) are only bounded by the axioms from the proof in the previous section and Assumptions (29) and (30), with the latter two distinguishing dependModel’s distribution from indepModel’s. The remainder of this paragraph (i.e. language interpretation guide entry) is essentially the same as in the description of Simplifying Assumption (9). This and Simplifying Assumption (12) are the weakest of the axioms with respect to the standards of accuracy that I strive for. Unlike the others, we cannot seriously claim that this axiom is literally true with respect to the informal intended semantics, simply because the authors of the British study did not methodically randomize the way that they chose their sample sets of men with and without lung cancer. I would be satisfied to have an axiom that says outcomeDistr(dependModel, ·) is “close enough” to a binomial distribution, but I have not yet investigated suitable ways of formalizing “close enough,” and it is not clear that there would be a benefit in pedagogy or cogency that warrants the added complexity.

$\forall s$:StudyOutcomes. outcomeDistr(dependModel, $s$) = condBinom($p_{S_d|\mathrm{LC}}$, $p_{S_d|\overline{\mathrm{LC}}}$, $s$)

Simplifying Assumption 12 (indepModel posits a binomial distribution).

$\forall s$:StudyOutcomes. outcomeDistr(indepModel, $s$) = condBinom($p_{S_i|*}$, $p_{S_i|*}$, $s$)

The next two axioms bound the frequencies mentioned in the previous two axioms. The 
description of Axiom (25) in the previous section has some motivation that applies here 
as well. 

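Assumption 29.

\[
\tfrac{1}{2} \cdot \Pr_{x \in A^{\mathrm{samp}}}(x \notin S_A^{\mathrm{samp}} \mid x \in \mathrm{LC}_A^{\mathrm{samp}}) \;\le\; 1 - p_{S_d|\mathrm{LC}} \;\le\; 2 \cdot \Pr_{x \in A^{\mathrm{samp}}}(x \notin S_A^{\mathrm{samp}} \mid x \in \mathrm{LC}_A^{\mathrm{samp}})
\]

Assumption 30.

\[
\tfrac{1}{2} \cdot \Pr_{x \in A^{\mathrm{samp}}}(x \notin S_A^{\mathrm{samp}} \mid x \in \overline{\mathrm{LC}}_A^{\mathrm{samp}}) \;\le\; 1 - p_{S_d|\overline{\mathrm{LC}}} \;\le\; 2 \cdot \Pr_{x \in A^{\mathrm{samp}}}(x \notin S_A^{\mathrm{samp}} \mid x \in \overline{\mathrm{LC}}_A^{\mathrm{samp}})
\]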
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The next two axioms tell us how to compute the distributions 
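Claim 14. For $n = |LC_{\mathrm{samp}}|$ (and recall $|\overline{LC}_{\mathrm{samp}}| = |LC_{\mathrm{samp}}|$), 

$$\mathrm{condBinom}(p_1, p_2, a) = \frac{\mathrm{binDistr}_{p_1,n}(a) \cdot \mathrm{binDistr}_{p_2,n}(|S_{\mathrm{samp}}| - a)}{\sum_{x=\min(\mathrm{StudyOutcomes})}^{\max(\mathrm{StudyOutcomes})} \mathrm{binDistr}_{p_1,n}(x) \cdot \mathrm{binDistr}_{p_2,n}(|S_{\mathrm{samp}}| - x)}$$ 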

Proof. This is a standard symbolic expression for the conditional binomial distribution. 
Note that we could alternatively have made condBinom(·, ·, ·) a defined function symbol. 

□ 

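Claim 15. For $n = |LC_{\mathrm{samp}}|$ (and recall $|\overline{LC}_{\mathrm{samp}}| = |LC_{\mathrm{samp}}|$), 

$$\mathrm{condBinom}(p, p, a) = \frac{\binom{n}{a} \binom{n}{|S_{\mathrm{samp}}| - a}}{\binom{2n}{|S_{\mathrm{samp}}|}}$$ 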

Proof. This is a well known fact, but I will provide the proof since it is short. 

Let $n = |LC_{\mathrm{samp}}|$ and $j = |S_{\mathrm{samp}}|$. Referring to its prose definition, condBinom(p, p, a) equals 
the probability of getting a successes when sampling n times from a distribution with 
success probability p, given that you’ve also sampled n times from another distribution 
with success probability p and the total of the two success counts is j. Equivalently, 
condBinom(p, p, a) is 

$$\frac{\Pr(\text{get } a \text{ out of } n \text{ successes from first distribution and } j - a \text{ out of } n \text{ from second})}{\Pr(\text{get total of } j \text{ successes out of } 2n)}$$ 

Because the two distributions are sampled from independently, that equals 

$$\frac{\Pr(\text{get } a \text{ out of } n \text{ successes from first}) \cdot \Pr(\text{get } j - a \text{ out of } n \text{ from second})}{\Pr(\text{get total of } j \text{ successes out of } 2n)}$$ 

i.e. 

$$\frac{\mathrm{binDistr}_{p,n}(a) \cdot \mathrm{binDistr}_{p,n}(j - a)}{\mathrm{binDistr}_{p,2n}(j)}$$ 

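Expanding the definition of binDistr, that is: 

$$\frac{\binom{n}{a} p^{a} (1-p)^{n-a} \cdot \binom{n}{j-a} p^{j-a} (1-p)^{n-(j-a)}}{\binom{2n}{j} p^{j} (1-p)^{2n-j}}$$ 

The powers of $p$ and $1-p$ in the numerator multiply to $p^{j}(1-p)^{2n-j}$, which cancels against the denominator and leaves $\binom{n}{a}\binom{n}{j-a} / \binom{2n}{j}$. 

□ 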
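
Claims 14 and 15 can be checked numerically on small cases. The following is a minimal Python sketch, not part of the formal proof: the function names (`bin_distr`, `cond_binom`, `hypergeom_ratio`) and the toy values of n and j are illustrative assumptions, and the sketch assumes that StudyOutcomes ranges over every feasible count from max(0, j − n) to min(n, j).

```python
from math import comb


def bin_distr(p, n, k):
    """Binomial probability of exactly k successes in n trials."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def cond_binom(p1, p2, a, n, j):
    """Claim 14: probability of a successes in the first sample of size n,
    given that the two samples (each of size n) have j successes in total."""
    # Assumed range of StudyOutcomes: every feasible value of a.
    outcomes = range(max(0, j - n), min(n, j) + 1)
    numerator = bin_distr(p1, n, a) * bin_distr(p2, n, j - a)
    denominator = sum(
        bin_distr(p1, n, x) * bin_distr(p2, n, j - x) for x in outcomes
    )
    return numerator / denominator


def hypergeom_ratio(a, n, j):
    """Claim 15: closed form when both success probabilities are equal."""
    return comb(n, a) * comb(n, j - a) / comb(2 * n, j)
```

With equal success probabilities the value returned by `cond_binom` does not depend on p, which is why the independent-variables model has no free parameters.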

The following Conjecture should be easier to prove than Conjecture 1, but it is still a 
tough (for a non-specialist) constrained nonlinear optimization problem. 

Conjecture 2. 

For k in {0, 1, 2} have $\mathrm{test}_k(\mathrm{dependModel}) > 2000 \cdot \mathrm{test}_k(\mathrm{indepModel})$ 

Informal argument in support: 

The independent variables model has no parameters, so $\mathrm{test}_k(\mathrm{indepModel})$ is a constant 
for each $k \in \{0, 1, 2\}$. For each $k \in \{0, 1, 2\}$, viewing the 3D plot of $\mathrm{test}_k(\mathrm{dependModel})$ 
as a function of the parameters $p_{S^d \mid LC}$ and $p_{S^d \mid \overline{LC}}$ (pictured below), if we make the very 
plausible assumption that there are no sharp extrema missed by the plot, then it is clear 
that within the range allowed by Axioms (29) and (30), the function is minimized at 
one of the corner points. In fact each of the three functions is minimized when $p_{S^d \mid LC}$ is 
minimal and $p_{S^d \mid \overline{LC}}$ is maximal. This is because the maximum likelihood model for the 
American data slightly overestimates the correlation between smoking and lung cancer 
in the British data. The minima tell us that for any setting of the parameters of the 
dependModel that obeys the axioms, for $\mathrm{test}_0(\cdot)$, $\mathrm{test}_1(\cdot)$, and $\mathrm{test}_2(\cdot)$ it gives probability 
more than 12,000, 5,000, and 2,000 times higher, respectively, than indepModel. 


Figure 7.1: Optimization problems for k = 1,2,3 






Chapter 8 


Example: Assisted suicide should be 
legalized in Canada 


This is the most complex and ambitious of all the examples in this thesis. Not all of the 
many essentially-boolean-algebra lemmas have been formally verified using a first-order 
theorem prover, although any one of them can be checked easily by hand. Those remaining 
lemmas are stated as Claims. 

This argument is meant to be read in a browser, and can be found at: 

http://www.cs.toronto.edu/~wehr/thesis/assisted_suicide_msfol.html 

I include an inferior static version here just in case you have a printed copy and you 
strongly prefer to read on paper. 
Throughout this argument, S, NS, AS, PAS abbreviate 'suicide', 'no suicide', 'assisted 
suicide', 'physician-assisted suicide', respectively. 

This is an argument for favouring the introduction of legislation that founds an 
administrative system (here broadly-defined) for granting Canadians access to 
physician-assisted suicide in limited cases. Before getting into the specifics, I want 
to discuss some important differences between this argument and (purely) natural 
language, non-deductive arguments also in favour of physician-assisted suicide 
(PAS). 

The conclusion of this argument is weaker than the conclusions of the non-deductive 
arguments. In particular, the advocated (underspecified) PAS system is 
under-inclusive: it would not grant access to all people who I (or almost any proponent of 
physician-assisted suicide) think should have access. I am prepared to argue, 
informally, that under-inclusiveness in general (not the particular kind or degree of 
under-inclusiveness that the system involved here has) is a necessary feature of any 
sufficiently-rigorous argument that does not make stronger assumptions than this one. 
More specifically, for any argument 

• whose assumptions do not preclude the possibility of regrettable uses of 
physician-assisted suicide (see <regrettable PAS>), and 

• that does not make strong assumptions along the lines of "regrettable assisted 
suicides aren't that bad", and 

• whose assumptions do not guarantee that the advocated administrative system 
for PAS will have access to additional information about applicants, 

there will be applicants who should, in a subjective moral sense according to 
proponents like me, have access to PAS, but who are indistinguishable from 
(hypothetical) applicants whose assisted suicide would be regrettable. 

Even so, one could argue that this system is especially or unnecessarily 
under-inclusive, and that I won't dispute. This argument should be taken as a 
proof-of-concept. A more serious attempt would have to involve much more research about 
this particular issue (e.g. data from the history of the systems in places like the 
Netherlands and Oregon) than I can afford to do for my thesis. A recent source of 
such information and much more is Quebec's Committee report on dying with 
dignity. 

At this point I could enumerate the ways that existing non-deductive/natural 
language arguments use stronger assumptions than those employed here. But here is 
a method that you can use to collect most of those ways yourself: for each 
Assumption and Simplifying Assumption, consider how one could use natural 
language to hide the details of the problem that the assumption addresses. For 
example, I will spend a surprising amount of time delineating the things that I 
consider possibly-pertinent to the decision of whether or not to support the 
legislation. I have never seen this done in such a clear and explicit way in informal 
arguments; often the best one can hope for is that it can be unambiguously gleaned 
from reading nearly the entire text of the argument (which is book-length in some 
cases, e.g. the committee report linked to in the previous paragraph). Another 
example is the unusually-specific weighing of the (sometimes theoretical) pros and 
cons of such legislation; in this argument I go a level deeper than any natural 
language argument I have seen in the justification for the main required subjective 
proposition, Lemma 74. 

There may already exist a natural language argument about PAS that is every bit as 
fair and disciplined as the one I give here (the best concise argument I've read was 
written by the Supreme Court justices in the minority opinion for the Sue Rodriguez 
case). A similar thing can be said for any proposed prescription or standard; always 
there are actors in the scope of the prescription that have come to comply with it on 
their own. In the case of maximally fair/disciplined argumentation, like in the case of 
prescriptions for limiting pollution, or the protection of human rights, etc, those 
actors should be applauded, but their existence should not detract from the 
importance of the prescription unless they are sufficiently common (where 
"sufficiently common" is relative: we would need them to be very common in the 
case of pollution, or universal in the case of human rights). 

Some features of the administrative system 

First, a person must opt-in before being diagnosed with their terminal illness. The 
purpose of this feature is to lower the (already low) incidence of some forms of 
difficult instances of regrettable assisted suicides (<difficult-regrettable PAS>), 
which are regrettable assisted suicides in which 

1. The applicant is correctly diagnosed; or the applicant is incorrectly diagnosed, 
but nonetheless suffers from an illness that is terminal without treatment, and 
all methods of diagnosis have been exhausted (see <PAS with wrong and 
substandard diagnosis>). 

2. The person who uses the lethal drugs is the person who applied for them (see 
<PAS of wrong person>). 

3. The applicant uses the drugs with the intention of dying. 

4. The applicant was not actively coerced into applying for and/or taking the 
drugs (see <coerced suicide>). 

Second, a person must take the drugs in the presence of their physician. This is not 
ideal, and is not required in Oregon, for example, but it is useful for justifying the 
assumption that with very high probability, the negations of items 2 and 3 above 
never happen. 

Third, there is a fixed list of eligible terminal diseases, which includes only those for 
which it can be established with "high" probability that a person has the disease and 
will "very likely" not live for more than X months. Each of "high", "very likely" and 
X can differ for different patients; the first two only need to be high enough and the 
third small enough that the patient is unwilling to try to beat the doctors' reported 
odds. 

Fourth, in some parts of Canada, prosecution for illegal assisted suicide is rare 
relative to its incidence. This must remain the case even after a PAS law is passed. 
This is to justify the assumption that <S, NS because of incr fear of prosecution> 
happens to no one with high probability (in Assumption 2). In particular, a person 
answering a request to assist another person in dying, where the second person is 
ineligible for legal PAS, should not be, or perceive themselves to be, at increased risk of 
prosecution compared to the risk before the law is passed. 

Fifth, to justify the assumption that <NS, NS & worse palliative care> happens to no 
one with high probability (in Assumption 2), the government will monitor private 
donations towards palliative care and research on palliative care, and if the 
appropriately-adjusted funds per person drop below what they were before the PAS law 
was passed, the government will increase public funding to compensate. 

Sixth, to support all of Assumption 2 through Assumption 6, since all can be broken 
by incompetence, the government will adopt one of the provisions from Quebec's 
Bill 52, which they summarize as follows: 

A commission on end-of-life care is established under the name 
“Commission sur les soins de fin de vie”, as well as rules with respect 
to its composition and operations. The mandate of the Commission is to 
examine all matters relating to end-of-life care and to oversee the 
application of specific requirements relating to medical aid in dying. 

High-level features of the argument 

I refer to the things that the axioms assume are pertinent to comparing the two 
outcomes (legislation passed, or not) as individual-future relations (sort IFR). They 
are essentially just relations on the lives of people (sort People) in a given possible 
future (sort OFuture). 

Classifying the differences of a person's life between two possible 
futures 

We are going to be reasoning about the costs and benefits to people of the proposed 
legislation in terms of some of the qualitative effects, for each person, of "moving" 
from a possible future without PAS (status-quo future) to one with PAS (assisted 
suicide future). To that end, we categorize people using individual-future-difference 
relations (sort Δ), which are relations about the features of their lives in a 
given pair of possible futures (sort OFuturePair). Usually, a Δ is determined by a pair 
of IFR's; the people satisfying such a Δ satisfy the first IFR in the status-quo future 
and the second IFR in the assisted suicide future. Some Δ's can't easily be 
represented as a product of two IFR's. An example is <NS, NS & worse palliative 
care>, which means: 

The person's death is not a suicide in the status-quo future or the assisted 
suicide future, and the quality of palliative care available to them is 
worse in the assisted suicide future compared to in the status-quo future. 

The problem is that worse palliative care depends on both of the futures. Such a Δ 
could be defined by a family of IFR-pairs, but that would appear to require 
quantifying the quality of palliative care, which is difficult to do right (in a way that 
is satisfactory to everyone), and even if done right introduces an unnecessary 
abstract entity into the argument (namely the partially-ordered set of quantities). 

Distributions over possible futures 

Our uncertainty about how the future will turn out, with or without new legislation 
for PAS, is modelled by a Bayesian probability distribution 𝒟 over pairs of possible 
futures, where the first (resp. second) element of a pair is a possible future in which 
the proposed legislation is not (resp. is) passed. To be accurate, the distribution is 
over equivalence classes of pairs of possible futures (sort OFuturePairEC), where two 
pairs are in the same class if they agree on δ(p, ·) for every individual-future-difference 
relation δ and every person p (see definition of ec). That is just so that our 
interpretations of 𝒟 can be finite (since we define only finitely-many Δ's), to 
assuage any discomfort about measures over possible futures. 
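
To make the equivalence classes concrete, here is a minimal Python sketch under stated assumptions. The encoding of a possible future as a dictionary, the two toy difference relations, and the helper names (`signature`, `equivalence_classes`) are all hypothetical and chosen only for illustration; the formal definition remains the one given for ec.

```python
from collections import defaultdict


def signature(pair, deltas, people):
    """The set of (relation name, person) facts that hold in a future pair."""
    return frozenset(
        (name, p)
        for name, delta in deltas.items()
        for p in people
        if delta(p, pair)
    )


def equivalence_classes(pairs, deltas, people):
    """Group future pairs that agree on delta(p, .) for every delta and p."""
    classes = defaultdict(list)
    for pair in pairs:
        classes[signature(pair, deltas, people)].append(pair)
    return list(classes.values())


# Toy data (hypothetical): a future maps each person to "S" or "NS",
# plus an extra detail that no difference relation inspects.
people = ["ann", "bob"]
deltas = {
    "<NS,S>": lambda p, pair: pair[0][p] == "NS" and pair[1][p] == "S",
    "<NS,NS>": lambda p, pair: pair[0][p] == "NS" and pair[1][p] == "NS",
}
sqf = {"ann": "NS", "bob": "NS", "weather": "rain"}
asf_1 = {"ann": "S", "bob": "NS", "weather": "rain"}
asf_2 = {"ann": "S", "bob": "NS", "weather": "sun"}
asf_3 = {"ann": "NS", "bob": "NS", "weather": "sun"}
pairs = [(sqf, asf_1), (sqf, asf_2), (sqf, asf_3)]
classes = equivalence_classes(pairs, deltas, people)
```

In this toy example the first two pairs differ only in a detail that no relation inspects, so they fall into the same class; with finitely many relations and people there can only be finitely many classes.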

Two typical futures 

The language of Bayesian reasoning isn't used in informal debates about PAS, and it 
won't be used much here either. Instead, we will define two "typical" possible 
futures, according to whether or not the legislation is passed. In this introduction I 
will call them ASF (assisted suicide future) and SQF (status-quo future), though in 
the proof they are always paired together as 𝐅_typical. Assumption 1 is the axiom that 
says the goal <should pass> is implied by a statement about 𝐅_typical, so it is the 
assumption a critic should reject if they have a problem with passing from Bayesian 
distributions to the typical future pair, or a more-fundamental problem with the use 
of a Bayesian distribution at all. For δ an individual-future-difference relation, and 
𝐅 a pair of possible futures, let #(δ, 𝐅) be the number of people p such that δ(p) 
holds in 𝐅 (written δ(p, 𝐅) in the proof). Let Exp#(δ, 𝐅) be the expected value of 
#(δ, 𝐅) when 𝐅 is sampled from 𝒟. Then in the definition of 𝐅_typical, SQF and ASF 
are partially defined in such a way that: 

• extremely unlikely Δ's don't happen to anyone. 

• there are slightly-pessimistic counts (for proponents of PAS), relative to the 
expected values determined by 𝒟, for Δ's that are favourable for ASF. 

• there are slightly-optimistic counts (for opponents of PAS), relative to the 
expected values determined by 𝒟, for Δ's that are favourable for SQF. 
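
The two quantities used above, the count of people satisfying a difference relation in a future pair and its expected value under the distribution, can be sketched as follows. This is only an illustration under assumptions: the distribution is represented as a finite list of (probability, future pair) entries, futures are dictionaries, and the names `count`, `expected_count` and the toy data are hypothetical.

```python
def count(delta, pair, people):
    """Number of people p for which delta(p, pair) holds."""
    return sum(1 for p in people if delta(p, pair))


def expected_count(delta, distribution, people):
    """Expected value of the count when the pair is sampled from the
    distribution, given as a list of (probability, pair) entries."""
    return sum(prob * count(delta, pair, people) for prob, pair in distribution)


def ns_to_pas(p, pair):
    """Toy difference relation: no suicide in the first future, PAS in the second."""
    return pair[0][p] == "NS" and pair[1][p] == "PAS"


# Toy data (hypothetical): one status-quo future and two equally likely
# assisted suicide futures.
people = ["ann", "bob", "cy"]
sqf = {"ann": "NS", "bob": "NS", "cy": "NS"}
asf_a = {"ann": "PAS", "bob": "PAS", "cy": "NS"}
asf_b = {"ann": "PAS", "bob": "NS", "cy": "NS"}
distribution = [(0.5, (sqf, asf_a)), (0.5, (sqf, asf_b))]

exp_ns_to_pas = expected_count(ns_to_pas, distribution, people)
```

A typical future pair would then use a whole-number count chosen slightly below or above such an expected value, depending on which side the relation favours.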

Argument in steps 

Most of the main tasks of the argument are as follows: 

1. Semantically define a number of IFR's (i.e. introduce them as primitive 
symbols, with language interpretation guide entries). 

2. State uncontroversial axioms about relations between the IFR's. 

3. Syntactically define a number of Δ's in terms of IFR's, and semantically define 
a few more. 

4. Derive some statements relating the Δ's (using relations between IFR's), and 
assume some others as uncontroversial axioms. 

5. Define the pair of typical futures 𝐅_typical = <SQF, ASF> as discussed above. 

6. Syntactically define sets of people P₁, ..., Pₙ in terms of satisfying various Δ's 
with respect to 𝐅_typical. 

7. Derive some subset, disjointness and emptiness relations between the Pᵢ's 
(from relations between the Δ's). Also make assumptions, some of them 
controversial, about relations between some of the other Pᵢ's. 

8. Make subjective, controversial assumptions of this form: "The change in 
moving from SQF to ASF in the quality-of-life for the people in Pᵢ (considered 
as a group, not individually) is worse/better/approximately the same as the 
change for the people in Pⱼ." 

9. State easily-agreed-upon axioms that suffice to derive, from the other axioms 
and lemmas, that the change in moving from SQF to ASF in the quality of life 
for the set of <all people> is positive overall. 

Note that the form of the final conclusion in item 9 is the same as the form of the 
subjective assumptions in item 8. This may lead one to wonder what we gain from 
all this work. Here is part of the answer: the item 8 (and item 7) assumptions are 
much more specific than the item 9 conclusion, and because of this you learn a lot 
more about my opinions than you would from my simply asserting the item 9 
conclusion. That is a good thing on its own, but it also makes the task of criticizing 
my opinions much more feasible. 


Reserved variable declarations 


Variables φ, φ₁, φ₂, φ₃, ψ, ψ₁, ψ₂ are reserved for sort IFR. 

Variables δ, δ₁, δ₂, δ₃, δ₄ are reserved for sort Δ. 

Variable p is reserved for sort People. 

Variables F, F₁, F₂, F₃, F₄ are reserved for sort OFuture. 

Variables 𝐅, 𝐅₁, 𝐅₂ are reserved for sort OFuturePair. 

Variables E, E₁, E₂ are reserved for sort OFuturePairEC. 

Variables P, Q, P₁, P₂, P₃, P₄ are reserved for sort FinSet(People). 

Variables Y, Y₁, Y₂, Y₃ are reserved for sort MultiSet(Δ). 


Sort operators 

Sort op FinSet - Finite subsets of the given sort. 

Sort op MultiSet - Finite multi-sets (unordered lists) of the given sort. 

Sort op Distr 

If the (interpretation of the) given sort S is finite, then it is the set of 
probability distributions over S. When S is not finite, you may define 
Distr(S) however you like (e.g. arbitrarily make it a singleton set whose 
member is not in the interpretation of any other sort symbol.) 

Show number predicate/function symbols 
Sorts 

Sort People 

The set of residents of Canada. For concreteness, let's say the residents who 
are alive sometime during 1993 (the year of Sue Rodriguez's Supreme Court 
hearing) or later. 

Sort OFuture 

Conceivable possible futures. This set is loosely defined. The main 
requirement, implicitly imposed by RelApp_IFR and RelApp_Δ, is that each 
element is defined precisely enough that any IFR relation can be evaluated 
on it (and similarly for Δ relations). 

Sort IFR 

Individual-Future Relation. An element is a relation on People × OFuture. 

Sort OFuturePair - This is OFuture × OFuture. 

Sort Δ 

Individual-future-difference relation. An element is just a relation on People 
× OFuturePair = People × OFuture × OFuture, which is accessed via RelApp_Δ. 

Sort OFuturePairEC 

Equivalence classes of OFuture pairs. Effectively defined by ec and rep. 

Sort FinSet(People) 

Sort MultiSet(Δ) 

Sort Distr(FPairEC) 

<should pass> : → 𝔹 

True if you think that legislation should be passed that introduces an 
administrative system for assisted suicide that is compliant with the description 
at the top of this argument. 

Show definitions for pairing: -[1], -[2] 

Show definitions of ∈, ∅ 

RelApp_IFR : IFR × People × OFuture → 𝔹 

When φ is an IFR relation, p is from People, and F is from OFuture, then 
RelApp_IFR(φ, p, F) means φ is true for p in F. The second order syntax φ(p, F) 
is used to display RelApp_IFR(φ, p, F). 

RelApp_Δ : Δ × People × OFuturePair → 𝔹 

When δ is a Δ relation, p is from People, and 𝐅 is from OFuturePair, then 
RelApp_Δ(δ, p, 𝐅) means δ is true for p in 𝐅. The second order syntax δ(p, 𝐅) is 
used to display RelApp_Δ(δ, p, 𝐅). 

Set/relation equality axioms 

Definitional Axiom 1: ∀P₁,P₂. P₁ = P₂ ⇔ (∀p. p ∈ P₁ ⇔ p ∈ P₂) 

Definitional Axiom 2: ∀φ₁,φ₂. φ₁ = φ₂ ⇔ (∀p. ∀F. φ₁(p, F) ⇔ φ₂(p, F)) 

Definitional Axiom 3: ∀δ₁,δ₂. δ₁ = δ₂ ⇔ (∀p,𝐅. δ₁(p, 𝐅) ⇔ δ₂(p, 𝐅)) 

Δof : People × OFuturePair → Δ 

δ = Δof(p, 𝐅) is the unique most-specific/smallest δ : Δ such that δ(p, 𝐅). For a 
given OFuturePair 𝐅, the differences (that you think are sufficiently-relevant to 
this debate) between a person p's experiences in the two futures (directed 
differences: the change from the first element of 𝐅 to the second element of 𝐅) 
are given by Δof(p, 𝐅). 

ΔsOf : FinSet(People) × OFuturePair → MultiSet(Δ) 

If P is a set of People, then ΔsOf(P, 𝐅) is the multiset of the same size |P| 
obtained by applying Δof(·, 𝐅) to P. 

>_Δ : MultiSet(Δ) × MultiSet(Δ) → 𝔹 

A subjective comparison relation. If Y₁ >_Δ Y₂ then you would rather have the 
life-experience changes Y₁ happen to |Y₁| random people than have the 
life-experience changes Y₂ happen to |Y₂| random people. It is a strict partial order. 

~ : MultiSet(Δ) × MultiSet(Δ) → 𝔹 

A subjective equivalence relation. If Y₁ ~ Y₂ then you are impartial or cannot 
decide whether you would rather have the life-experience changes Y₁ happen to 
|Y₁| random people or the life-experience changes Y₂ happen to |Y₂| random 
people. 

Defn ≥_Δ : MultiSet(Δ) × MultiSet(Δ) → 𝔹 - ∀Y₁,Y₂. Y₁ ≥_Δ Y₂ ⇔ (Y₁ >_Δ Y₂ ∨ Y₁ ~ Y₂) 
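
The multisets that these comparison relations range over are produced by applying the per-person difference function to a finite set of people. A minimal Python sketch, assuming a multiset is represented by `collections.Counter`; the function `toy_delta_of` is a hypothetical stand-in that labels a person by their status in each future, not the argument's actual most-specific-relation function.

```python
from collections import Counter


def deltas_of(people_set, pair, delta_of):
    """Multiset of the per-person differences for the given set of people."""
    return Counter(delta_of(p, pair) for p in people_set)


def toy_delta_of(p, pair):
    """Hypothetical stand-in: name the difference by the person's status
    in the first and second future of the pair."""
    return "<{},{}>".format(pair[0][p], pair[1][p])


# Toy data (hypothetical).
sqf = {"ann": "NS", "bob": "NS", "cy": "S"}
asf = {"ann": "PAS", "bob": "NS", "cy": "PAS"}
changes = deltas_of({"ann", "bob", "cy"}, (sqf, asf), toy_delta_of)
```

The resulting multiset has one entry per person, so its size equals the size of the set of people, as the description of the multiset function requires.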

Defn~ ec : OFuturePair → OFuturePairEC - ∀𝐅₁,𝐅₂. ec(𝐅₁) = ec(𝐅₂) ⇔ (∀p,δ. δ(p, 𝐅₁) ⇔ δ(p, 𝐅₂)) - 
Equivalence class of the given OFuturePair. See description of 
OFuturePairEC. 

Defn~ rep : OFuturePairEC → OFuturePair - ∀E. ec(rep(E)) = E 

A representative for the given equivalence class. rep(E) can be any 
𝐅 : OFuturePair such that ec(𝐅) = E. 

Defn swap : OFuturePair → OFuturePair - ∀𝐅. swap(𝐅) = ⟨𝐅[2], 𝐅[1]⟩ 

Defn~ pair_IFR : IFR × IFR → Δ - ∀p,𝐅,φ,ψ. pair_IFR(φ, ψ)(p, 𝐅) ⇔ (φ(p, 𝐅[1]) ∧ ψ(p, 𝐅[2])) 

The Δ relation obtained in the natural way from two IFR relations; the first (resp. 
second) IFR relation is applied to the first (resp. second) element of the given 
OFuturePair. Most elements of Δ can be defined in this way. An example of an 
exception is <S, NS & worse palliative care>, because "worse palliative care" 
depends on both elements of the OFuturePair. Such a Δ relation could be defined 
by a family of IFR-pairs, although that would appear to require quantifying the 
quality of palliative care, which, if done right (in a way that is satisfactory to 
everyone), introduces an unnecessary abstract entity into the argument, namely 
the partially-ordered set of quantities. 

Defn~ peopleIn_IFR : IFR × OFuture → FinSet(People) - ∀φ,F,p. p ∈ peopleIn_IFR(φ, F) ⇔ φ(p, F) 

Defn~ peopleIn_Δ : Δ × OFuturePair → FinSet(People) - ∀δ,𝐅,p. p ∈ peopleIn_Δ(δ, 𝐅) ⇔ δ(p, 𝐅) 
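
The pairing construction and the two "people in" functions can be sketched in Python as follows. This is an illustration under assumptions, not the formal definition: relations are modelled as predicates, a future as a dictionary from people to "S" or "NS", and an explicit finite universe `people` is passed in (the formal symbols take no such argument). All names are hypothetical.

```python
def pair_ifr(phi, psi):
    """Difference relation that holds when phi holds in the first future
    of the pair and psi holds in the second."""
    return lambda p, pair: phi(p, pair[0]) and psi(p, pair[1])


def people_in_ifr(phi, future, people):
    """Finite set of people satisfying the individual-future relation phi."""
    return frozenset(p for p in people if phi(p, future))


def people_in_delta(delta, pair, people):
    """Finite set of people satisfying the difference relation delta."""
    return frozenset(p for p in people if delta(p, pair))


def suicide(p, future):
    return future[p] == "S"


def no_suicide(p, future):
    return not suicide(p, future)


# Toy data (hypothetical).
people = {"ann", "bob", "cy"}
sqf = {"ann": "NS", "bob": "NS", "cy": "S"}
asf = {"ann": "S", "bob": "NS", "cy": "S"}
ns_then_s = pair_ifr(no_suicide, suicide)
result = people_in_delta(ns_then_s, (sqf, asf), people)
```

For a relation built by pairing, the set of people satisfying it is the intersection of the two per-future sets, which is the sense in which it is a product of two individual-future relations.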

Defn~ ∅ : MultiSet(Δ) - ∀𝐅. ∅ = ΔsOf(∅, 𝐅) 

Empty multiset that corresponds to a 'neutral' judgement; if ΔsOf(P, 𝐅) = ∅, then 
for the set of people P, the pair of futures 𝐅 are, overall, of equal value with 
respect to the Δ relations in use. 

Defn~ <true> : IFR - ∀F. ∀p. <true>(p, F) - Trivial IFR such that <true>(p, F) for all 
p, F. 


Show set function definitions 
Show set predicate definitions 
Show set facts 


IFR constants 

<S> : IFR - The person's death is a suicide. 

Defn <NS> : IFR := co(<S>) - The person's death is not a suicide. 

Defn~ <PAS> : IFR - <PAS> ⊆ <S> - The person ends their life via 
physician-assisted suicide. 

Defn <non-PA suicide> : IFR := <S> \ <PAS> - The person's death is a 
non-physician-assisted suicide. 

Defn~ <regrettable PAS> : IFR - <regrettable PAS> ⊆ <PAS> 

A person p is in this IFR if they legally use physician-assisted suicide and if 
there is some information about p, which was unknown at the time when 
their application for PAS was approved, that, if it had been known, would 
have caused a significant proportion (say, 20%) of people who would have 
supported p’s application to resolutely change their mind. Here "resolutely" 
means that no further information about p would again change the minds of 
those 20% of people. A more concise but possibly more vague definition 
would be to say that at least 20% of p’s supporters, if given "perfect 
information" about p, would change their minds. 

Defn <non-regrettable PAS> : IFR := <PAS> \ <regrettable PAS> 

Defn~ <PAS with wrong and substandard diagnosis> : IFR - <PAS with wrong 
and substandard diagnosis> ⊆ <regrettable PAS> 

The person is incorrectly diagnosed with a soon-to-be-terminal condition 
when their real condition is presently treatable. For this argument, we will 
only allow PAS for conditions that can be diagnosed with very high 
accuracy. The other option one could take is to argue that a patient can still 
benefit from PAS even when they were wrongly diagnosed, provided at least 
that they had the best medical care available. The point is that, if all available 
evidence has been taken into account, including the probabilities of 
misleading and missing evidence, then the individual's decision indirectly 
expresses their utilities of living with suffering vs dying unnecessarily, and 
in that case we should use their utilities as opposed to an average. 

Defn~ <coerced suicide> : IFR - <coerced suicide> ⊆ <S> 

The person was actively coerced into suicide (assisted or otherwise) by 
another person, where "actively coerced" means influenced by some means 
other than guilt, a sense of duty, etc. This is another logical possibility that is 
easy to "deal with" in this argument (where "deal with" does not mean 
prevent - see Assumption 4). 

Defn <non-coerced suicide> : IFR := <S> \ <coerced suicide> 

Defn~ <PAS of wrong person> : IFR - <PAS of wrong person> ⊆ <regrettable 
PAS> 

This could theoretically happen in Oregon, for example, since the prescribed 
lethal drugs don't need to be taken in the presence of a physician. To simplify 
this argument, we posit a stricter system in which the prescribed drugs must 
be taken in the presence of a physician. 

Defn <coerced PAS> : IFR := <coerced suicide> ∩ <PAS> 

Definitional Axiom 4: <coerced PAS> ⊆ <regrettable PAS> 

Defn <very bad PAS> : IFR := <PAS of wrong person> ∪ (<coerced PAS> ∪ <PAS 
with wrong and substandard diagnosis>) 

Union of some relations that, when true, constitute major failures of the 
safeguards of the assisted suicide system. 

Defn <non-coerced PAS> : IFR := <PAS> \ <coerced PAS> 

Defn <very bad non-coerced PAS> : IFR := <very bad PAS> ∩ <non-coerced PAS> 

Defn <regrettable non-coerced PAS> : IFR := <regrettable PAS> ∩ <non-coerced PAS> 

Defn <non-regrettable non-coerced PAS> : IFR := <non-regrettable PAS> ∩ <non-coerced PAS> 

Defn <non-coerced non-PA suicide> : IFR := <non-coerced suicide> ∩ <non-PA suicide> 

Defn <difficult-regrettable PAS> : IFR := <regrettable PAS> \ <very bad PAS> 

The kinds of hypothetical regrettable uses of PAS that are hardest to prevent. 

Defn <difficult-regrettable non-coerced PAS> : IFR := <difficult-regrettable PAS> 
∩ <non-coerced PAS> 
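
The defined IFR constants above are plain set algebra over the primitive ones. As a sanity check, here is a minimal Python sketch that fixes one future and models each IFR by the set of people satisfying it in that future. The six people and the extensions of the primitive relations are hypothetical toy data, chosen so that the stated subset axioms hold.

```python
everyone = frozenset({"a", "b", "c", "d", "e", "f"})

# Primitive IFRs: hypothetical extensions in one fixed future.
S = frozenset({"a", "b", "c", "d", "e"})
PAS = frozenset({"a", "b", "c", "d"})
regrettable_pas = frozenset({"a", "b", "c"})
coerced_suicide = frozenset({"a", "e"})
pas_wrong_person = frozenset({"b"})
pas_wrong_diagnosis = frozenset()

# Defined IFRs, following the ":=" definitions.
NS = everyone - S
coerced_pas = coerced_suicide & PAS
very_bad_pas = pas_wrong_person | (coerced_pas | pas_wrong_diagnosis)
non_coerced_pas = PAS - coerced_pas
difficult_regrettable_pas = regrettable_pas - very_bad_pas
difficult_regrettable_non_coerced_pas = difficult_regrettable_pas & non_coerced_pas


def axioms_hold():
    """Check the subset constraints stated for the primitive IFRs."""
    return (
        PAS <= S
        and regrettable_pas <= PAS
        and coerced_suicide <= S
        and pas_wrong_person <= regrettable_pas
        and pas_wrong_diagnosis <= regrettable_pas
        and coerced_pas <= regrettable_pas  # Definitional Axiom 4
    )
```

In any model of the axioms, the "very bad" cases are a subset of the regrettable ones, and the "difficult-regrettable" cases are exactly the regrettable cases that are not very bad.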

<NS & worst-case denial of PAS> : IFR 

These are the people who have a genuine self-centered desire for PAS and 
suffer most from not having the option of PAS. Includes at least each person 
who has an (eventually) physically and/or mentally painful condition and: 

• Will not be physically capable of committing suicide without help, by 
the time they wish to. E.g. if their medical condition severely restricts 
their movement. And: 

Doesn't know anyone who would be willing to assist them, or does 
know such a person but is unwilling to put them at risk of prosecution, 
or is unwilling to break the law. 

Or: 

• Would be physically capable, but 

o Can't afford to travel to a place where PAS is legal. 

And: 

o Is unwilling to end their life in a way that disfigures their body 
(e.g. for consideration of the person who finds them, or their 
family, or religious reasons). 

<NS & typical-case denial of PAS> : IFR 

These are the people who have a genuine self-centered desire for PAS and 
don't suffer greatly from not having the option of PAS (i.e. not in <NS & 
worst-case denial of PAS>), but whose quality of life is still negatively 
affected. 

<NS & best-case denial of PAS> : IFR 

This is the (possibly empty) set of people who have a genuine 
self-centered desire for PAS but nonetheless benefit, overall, from not having the 
option of PAS. 

Defn~ <NS & desire for PAS> : IFR - DisjointUnion₃(<NS & desire for PAS>, 
<NS & worst-case denial of PAS>, <NS & typical-case denial of PAS>, <NS & 
best-case denial of PAS>) ∧ <NS & desire for PAS> ⊆ <NS> - Person wanted PAS 
but didn't receive it (for any reason). 

Defn <NS & no desire for PAS> : IFR := <NS> \ <NS & desire for PAS> 


Δ constants 

Show trivial Δ definitions 
<true, NS & worse palliative care> : Δ 

A genuine worry for some opponents of PAS is that a PAS policy could lead 
to decreased funding for palliative care. There are two ways one can address 
this concern: (1) use data from jurisdictions that have PAS to argue that it 
hasn't happened there, and postulate that Canada will be the same; (2) take 
extra, active measures to defend against this happening. I have only touched 
on (2); see the note about this in the introduction (starts "Fifth, to justify the 
assumption..."). 

Defn <S, NS & worse palliative care> : Δ := <true, NS & worse palliative care> ∩ 
<S,NS> 

Defn <NS, NS & worse palliative care> : Δ := <true, NS & worse palliative care> 
∩ <NS,NS> 

The person's death is not a suicide in the status-quo future or the assisted 
suicide future, and the quality of palliative care available to them is worse in 
the assisted suicide future compared to in the status-quo future. 

Defn~ <S, NS because of incr fear of prosecution> : Δ - <S, NS because of incr 
fear of prosecution> ⊆ <S,NS> 

There is a logical possibility that a person will choose to end their life via 
illegal non-physician-assisted suicide if legal PAS is not an option for 
anyone, but will choose not to if legal PAS is an option for some people but 
not for them. If the difference is because they have an increased fear that 
whoever helps them to die will be prosecuted, then they would be worse off 
in the assisted-suicide future (with a new law passed) than they would be in 
the status-quo future. We will specify that any PAS law that this argument 
endorses does not lead to increased prosecution for illegal assisted suicide. 

<NS, NS because of incr fear of prosecution> : Δ 

Definitional Axiom 5: <NS, NS because of incr fear of prosecution> ⊆ <NS,NS> 

Defn <true, NS because of incr fear of prosecution> : Δ := <NS, NS because of 
incr fear of prosecution> ∪ <S, NS because of incr fear of prosecution> 

Definitional Axiom 6: Disjoint(<NS, NS because of incr fear of prosecution>, <S, 
NS because of incr fear of prosecution>) 

Defn <NS, NS & same-or-better palliative care> : Δ := <NS,NS> \ <NS, NS & 
worse palliative care> 

Defn <S, NS & same-or-better palliative care & no incr fear of prosecution> : Δ 
:= (<S,NS> \ <S, NS because of incr fear of prosecution>) \ <S, NS & worse 
palliative care> 

<NS, NS & same-or-better palliative care & exacerbated guilt> : Δ 

A genuine worry for some opponents of assisted suicide is that, with 
physician-assisted suicide available, we will see a decrease in the quality of 
life of people who are eligible for it but do not want it, due to, for example, 
feelings of being a burden on their family and friends. We do not try to 
design an assisted suicide system that prevents this; instead we acknowledge 
it and postulate that it is compensated for by the benefits of an assisted 
suicide system. 

<NS, NS & same-or-better palliative care & no exacerbated guilt> : A 
Definitional Axiom 7: DisjointUnion(<NS, NS & same-or-better palliative care>, 
<NS, NS & same-or-better palliative care & no exacerbated guilt>, <NS, NS & 
same-or-better palliative care & exacerbated guilt>) 
Defn EasierCases : A := (<true, NS & worse palliative care> ∪ <true, NS because 
of incr fear of prosecution>) ∪ (pair_IFR(<true>, <very bad PAS>) ∪ <S,S>) 

Defn <NS, coerced suicide> : A := pair_IFR(<NS>, <coerced suicide>) \ EasierCases 

Defn <coerced suicide, NS> : A := pair_IFR(<coerced suicide>, <NS>) \ EasierCases 

Defn <NS, non-PA suicide & non-coerced> : A := pair_IFR(<NS>, <non-PA suicide> 
∩ <non-coerced suicide>) \ EasierCases 

Defn <non-PA suicide & non-coerced, NS> : A := pair_IFR(<non-PA suicide> ∩ 
<non-coerced suicide>, <NS>) \ EasierCases 

Defn <NS, difficult-regrettable PAS> : A := pair_IFR(<NS>, <difficult-regrettable 
PAS>) \ EasierCases 

Defn <NS & typical-case denial of PAS, non-regrettable PAS> : A := pair_IFR(<NS 
& typical-case denial of PAS>, <non-regrettable PAS>) \ EasierCases 

Defn <NS & worst-case denial of PAS, non-regrettable PAS> : A := pair_IFR(<NS 
& worst-case denial of PAS>, <non-regrettable PAS>) \ EasierCases 

Defn <S, NS & no incr fear of prosecution> : A := <S,NS> \ <S, NS because of incr 
fear of prosecution> 

Defn <S, NS & worse palliative care & no incr fear of prosecution> : A := <S, NS 
& worse palliative care> ∩ <S, NS & no incr fear of prosecution> 


Note that, at the current level of detail of this proof, the next two symbols are only 
used in the semantic descriptions of other symbols - not in any axioms. 

𝒟 : Distr(OFuturePairEC) - A finite, subjective Bayesian probability distribution 
over OFuturePairEC 

Exp# : Distr(OFuturePairEC) × A → ℝ 

Exp#(D,δ) is the expected value of |peopleIn_A(δ, rep(E))| when OFuturePairEC E 
is chosen at random from D 

Defn~ <favourable for status quo> : A → 𝔹 - ∀δ. <favourable for status quo>(δ) ⇔ 
0 >_A AsOf(peopleIn_A(δ, F_typical), F_typical) 

<favourable for status quo>(δ) means that δ(-, F_typical) represents a negative 
change (or equivalently, δ(-, swap(F_typical)) represents a positive change). This is 
more easily conveyed by examples: <favourable for status quo>(<NS, coerced 
suicide>) holds because if <NS, coerced suicide>(p, F_typical) then clearly 
something gets better for p when moving from the assisted suicide future (the 
second of the pair) to the status quo future (or worse when moving in the 
opposite direction). 

F_typical : OFuturePair 

A particular future pair whose equivalence class is "typical", in a certain sense, 
with respect to 𝒟. At a high level, we assume: 

• Extremely unlikely (w.r.t. 𝒟) As don't happen to anyone in F_typical. 

• The count of people who δ happens to is roughly the expected value of the 
corresponding counting random variable of 𝒟. We round up for the As 
that are favourable for the status-quo (SQ) futures, and down for the As 
that are neutral or favourable for the assisted suicide futures. 

More precisely, there is a constant ε : ℝ≥0, which is a parameter of this argument, 
such that: 

• |peopleIn_A(δ, F_typical)| = 0 for all δ such that Exp#(𝒟,δ) < ε. 

• |peopleIn_A(δ, F_typical)| = ceil(x) for all δ such that Exp#(𝒟,δ) = x > ε and 
<favourable for status quo>(δ). 

• |peopleIn_A(δ, F_typical)| = floor(x) for all δ such that Exp#(𝒟,δ) = x > ε and 
¬<favourable for status quo>(δ). 
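
The rounding rule in the three bullets above can be sketched in a few lines of Python. This is only an illustration: the function name, the expected values, and the choice of ε are hypothetical, and treating the boundary case x = ε like x > ε is an assumption, since the text does not specify it.

```python
import math

def typical_count(expected: float, favourable_for_status_quo: bool, eps: float) -> int:
    """Number of people a change applies to in the typical future pair."""
    if expected < eps:
        # Extremely unlikely changes happen to no one in the typical pair.
        return 0
    # Assumption: expected == eps is handled like expected > eps.
    if favourable_for_status_quo:
        return math.ceil(expected)   # round up changes that favour the status quo
    return math.floor(expected)      # round down neutral or pro-assisted-suicide changes

# Hypothetical expected values (expected count, favourable for status quo?).
examples = {
    "<NS, coerced suicide>": (2.3, True),
    "<NS & worst-case denial of PAS, non-regrettable PAS>": (40.7, False),
    "pair(<true>, <PAS of wrong person>)": (0.004, True),
}
eps = 0.01
counts = {name: typical_count(x, fav, eps) for name, (x, fav) in examples.items()}
```

Both roundings move the counts in the direction that is less favourable to passing the law, which appears to be the point of rounding asymmetrically.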

Defn~ <all people> : FinSet(People) - ∀p. p ∈ <all people> 

Assumption 1: AsOf(<all people>, F_typical) >_A 0 ⇒ <should pass> 

Goal: <should pass> 

Defn~ peopleln^ : IFR -> FinSet(People) - Vcj). Vp. p G peopleln^ (cj)) <=> 

4KP» F typical [1]) 

Defn~ peopleln^p : IFR -> FinSet(People) - V(j). Vp. p G peopleln^ (cf)) <=> 
4KP • A typical [2]) 

Defn peopleIn_A : A → FinSet(People) - ∀δ. peopleIn_A(δ) = peopleIn_A(δ, 
F_typical) 

Hide > A axioms 

Show "« is an equivalence relation" 

Show "> A is a strict partial order." 

Definitional Axiom 14: ∀P_1,P_2,F. Disjoint(P_1, P_2) ∧ AsOf(P_1, F) ≈ 0 ∧ 
AsOf(P_2, F) ≈ 0 ⇒ AsOf(P_1 ∪ P_2, F) ≈ 0 

Definitional Axiom 15: ∀P_1,P_2. ∀F. Disjoint(P_1, P_2) ∧ AsOf(P_1, F) ≥_A 0 ∧ 
AsOf(P_2, F) ≥_A 0 ⇒ AsOf(P_1 ∪ P_2, F) ≥_A 0 

Definitional Axiom 16: ∀P_1,P_2. ∀F. Disjoint(P_1, P_2) ∧ AsOf(P_1, F) ≥_A 0 ∧ 
AsOf(P_2, F) >_A 0 ⇒ AsOf(P_1 ∪ P_2, F) >_A 0 

Definitional Axiom 17: ∀P_1,P_2. ∀F. Disjoint(P_1, P_2) ∧ AsOf(P_1, F) ≈ 
AsOf(P_2, swap(F)) ⇒ AsOf(P_1 ∪ P_2, F) ≈ 0 

Definitional Axiom 18: VPi,P 2 . VA . Disjoint(Pi, P 2 ) a AsOf(Pi, A) s= A 
AsOf(P 2 , swap(A)) a AsOf(Pi, A) 0 => AsOf(Pi u P 2 , A) & A 0 
Definitional Axiom 19: VPi,P 2 . VF . Disjoint(Pi, P 2 ) a AsOf(Pi, F) >^ 

AsOf(P 2 , swap(F)) a AsOf(Pi, F) 0 => AsOf(Pi U P 2 , F) 0 
Assumption 2: For each δ in {pair_IFR(<true>, <PAS of wrong person>), 
pair_IFR(<true>, <PAS with wrong and substandard diagnosis>), <S, NS because of 
incr fear of prosecution>, <NS, NS & worse palliative care>}: peopleIn_A(δ) = ∅ 

The proposed safeguards for the implementation make these As so unlikely 
that they do not apply to anyone in F_typical. 

Assumption 3: For each δ in {pair_IFR(<NS & no desire for PAS>, <non- 
regrettable PAS>), <NS, NS & same-or-better palliative care & no exacerbated 
guilt>, <S,S>, <S, NS & same-or-better palliative care & no incr fear of 
prosecution>}: AsOf(peopleIn_A(δ), F_typical) ≥_A 0 

The quality of life for these people is not significantly different between the 
two futures. We make the superficially-weaker assumption that their quality 
of life is not worse in the assisted-suicide future than it is in the status quo 
future. 

Assumption 4: AsOf(peopleIn_A(<NS, coerced suicide>), F_typical) ≈ 
AsOf(peopleIn_A(<coerced suicide, NS>), swap(F_typical)) 

Let P_1 be the people whose death is not a suicide in SQF and whose death is 
a coerced suicide in ASF. Let P_2 be the people whose death is a coerced 
suicide in SQF and not a suicide in ASF. Note that P_1 and P_2 are disjoint. 
When it comes to comparing SQF and ASF, the cost for the people in P_1 of 
moving from SQF to ASF is approximately equal to the cost for the people in 
P_2 of moving from ASF to SQF. In a more-detailed version of this argument, 
this assumption would be derived (by a single axiom) from two simpler ones: 

• The size of P_1 is approximately equal to the size of P_2. This requires 
that passing of PAS legislation eliminates as many coerced suicides as 
it introduces (P_2 are eliminated, P_1 are introduced). 

• You believe coerced suicide in ASF and coerced suicide in SQF are 
equally-bad fates. 

Assumption 5: AsOf(peopleIn_A(<NS & typical-case denial of PAS, non- 
regrettable PAS>), F_typical) ≥_A AsOf(peopleIn_A(<NS, NS & same-or-better 
palliative care & exacerbated guilt>), swap(F_typical)) 

Let P_1 be the people who satisfy <NS & typical-case denial of PAS> in SQF 
and <non-regrettable PAS> in ASF. Let P_2 be the people whose death is not a 
suicide in SQF who, in ASF, still do not die by suicide, and have equally-good 
access to palliative care, but also are tormented with guilt for not 
choosing PAS (and there was no such guilt for them in SQF, because PAS was 
illegal). When it comes to comparing SQF and ASF, the cost to the people in 
P_1 of moving from ASF to SQF is at least as great as the cost to the people in 
P_2 of moving from SQF to ASF. In a more-detailed version of this argument, 
this assumption would be derived (by a single axiom) from two simpler ones: 

• For some fraction n/m > 1 (say, n=10 and m=1), (n/m)|P_1| ≥ |P_2|. That 
is, you grant that P_2 is larger than P_1, but you posit that it is no more 
than (n/m) times larger. 

• You believe it is at least as bad to move m people who satisfy <NS & 
typical-case denial of PAS, non-regrettable PAS> from ASF to SQF as 
it is to move n people who satisfy <NS, NS & same-or-better palliative 
care & exacerbated guilt> from SQF to ASF. 

Assumption 6: AsOf(peopleIn_A(<NS & worst-case denial of PAS, non- 
regrettable PAS>), F_typical) >_A AsOf(peopleIn_A(<NS, difficult-regrettable PAS>), 
swap(F_typical)) 

This assumption is similar to Assumption 5 in form (except for using 
>_A instead of ≥_A), but it concerns more-severe kinds of changes. This 
assumption is often the main explicit disagreement between non-religious 
proponents of PAS and (self-reported) non-religious opponents of PAS. So 
please read the informal description of Assumption 5 first. Like Assumption 
5, in a more-detailed version of this argument, this assumption would be 
derived (by a single axiom) from two simpler ones, which are of the form in 
Assumption 5 except that one might argue the fraction n/m is < 1 if the 
safeguards of the PAS system are very good. 

| Hide definitions of sets of people corresponding to previous 5 assumptions. 

Defn People_1 : FinSet(People) := peopleIn_A(pair_IFR(<true>, <PAS with wrong 
and substandard diagnosis>)) ∪ (peopleIn_A(pair_IFR(<true>, <PAS of wrong 
person>)) ∪ (peopleIn_A(<NS, NS & worse palliative care>) ∪ peopleIn_A(<S, 
NS because of incr fear of prosecution>))) 

Defn People_2 : FinSet(People) := ((peopleIn_A(<S,S>) ∪ peopleIn_A(<S, NS & 
same-or-better palliative care & no incr fear of prosecution>)) ∪ 
peopleIn_A(<NS, NS & same-or-better palliative care & no exacerbated 
guilt>)) ∪ peopleIn_A(pair_IFR(<NS & no desire for PAS>, <non-regrettable 
PAS>)) 

Defn People_3 : FinSet(People) := peopleIn_A(<NS, coerced suicide>) ∪ 
peopleIn_A(<coerced suicide, NS>) 

Defn People_4 : FinSet(People) := peopleIn_A(<NS, NS & same-or-better 
palliative care & exacerbated guilt>) ∪ peopleIn_A(<NS & typical-case denial 
of PAS, non-regrettable PAS>) 

Defn People_5 : FinSet(People) := peopleIn_A(<NS, difficult-regrettable PAS>) 
∪ peopleIn_A(<NS & worst-case denial of PAS, non-regrettable PAS>) 
Assertion 1: AsOf(peopleIn^(<NS & typical-case denial of PAS, non-regrettable 

PAS >), Atypical ) '’A ^ 
From <NS & typical-case denial of PAS> to <non-regrettable PAS> is a good 
change. Some self-reported religious people, who are not in the intended 
audience of this argument, will reject this assumption. 

Assertion 2: AsOf(peopleIn^(<NS & worst-case denial of PAS, non-regrettable 

PAS>), F typical) 

From <NS & worst-case denial of PAS> to <non-regrettable PAS> is a good 
change. Some religious people, who are not in the intended audience of this 
argument, will reject this assumption. 

Simplifying Assumption 1: peopleIn_IFR(<NS & desire for PAS>) = 
peopleIn_IFR(<NS & typical-case denial of PAS>) ∪ peopleIn_IFR(<NS & worst-case 
denial of PAS>) 

From the proposition accompanying the introduction of <NS & desire for 
PAS>, this is equivalent to the assumption that there are no instances of <NS 
& best-case denial of PAS> in SQF. This assumption is disputable. It would 
be better to include the hypothetical set of people who satisfy <NS & best- 
case denial of PAS> in Assumption 5 or Assumption 6, which would amount 
to a strengthening of those assumptions. 

Lemma 70: AsOf(<all people>, F_typical) >_A 0 

Defn People_1-5 : FinSet(People) := (((People_1 ∪ People_2) ∪ People_3) ∪ 
People_4) ∪ People_5 

Lemma 71: <all people> = People_1-5 

Lemma 4: <all people> ⊆ ((peopleIn_A(<S,S>) ∪ peopleIn_A(<S,NS>)) ∪ 
peopleIn_A(<NS,S>)) ∪ peopleIn_A(<NS,NS>) 

The four sets on the right hand side are a partition of <all people> 
since <NS> is the complement of <S>. 

Lemma 5: peopleIn_A(<S,S>) ⊆ People_1-5 

Follows easily from basic set reasoning, Claim 22, and the 
definition of People_1-5. 

Claim 22: peopleIn_A(<S,S>) ⊆ People_2 

Lemma 6: peopleIn_A(<S,NS>) ⊆ People_1-5 

Follows easily from basic set reasoning, Claim 23, and the 
definition of People_1-5. 

Claim 23: peopleIn_A(<S,NS>) ⊆ People_1 ∪ People_2 

Claim 24: peopleIn_A(<NS,S>) ⊆ People_1-5 

Lemma 7: peopleIn_A(<NS,NS>) ⊆ People_1-5 

Follows easily from basic set reasoning, Claim 25, and the 
definition of People_1-5. 

Claim 25: peopleIn_A(<NS,NS>) ⊆ (People_1 ∪ People_2) ∪ People_4 

Lemma 72: People_1-5 ⊆ <all people> 

Lemma 73: <all people> ⊆ People_1-5 

Lemma 74: AsOf(People_1-5, F_typical) >_A 0 

Lemma 75: AsOf(People_5, F_typical) >_A 0 

Lemma 76: AsOf(((People_1 ∪ People_2) ∪ People_3) ∪ People_4, F_typical) 
≥_A 0 

Lemma 77: AsOf(People_4, F_typical) ≥_A 0 

Claim 26: Disjoint(peopleIn_A(<NS, NS & same-or-better 
palliative care & exacerbated guilt>), peopleIn_A(<NS & typical- 
case denial of PAS, non-regrettable PAS>)) 

Lemma 78: AsOf((People_1 ∪ People_2) ∪ People_3, F_typical) ≥_A 0 




Chapter 9 


Ongoing work 

9.1 Web system for collaborative authoring and criticizing 
of interpreted formal proofs, and a minimal 
dialogue system 

Initially I thought that a sophisticated dialogue system, with rules designed to ensure 
progress under certain assumptions, would be essential to move forward with this project. 
With more experience writing interpreted formal proofs, however, it became clear that 
reasoning faithfully about complicated inelegant structures was already so onerous a task 
that it would be asking too much of authors to require the extra work of demonstrating 
that their argumentative moves make progress. This has led me to shift to a relatively 
simple and lax model of interaction. The end of this chapter contains some notes about 
the issue of progress. 

9.1.1 Related work from Informal Logic 

Carneades: 

Carneades is a web application in active development “which provides software tools 
based on a common computational model of argument graphs useful for policy deliberations 
and claims processing.” [Gor13] It is the application, of those I am aware of, that is most 
related to the one I am working on, although its focus on propositional defeasible reasoning 
makes it still only weakly related. In more detail, the principal developer, Thomas F. 
Gordon, describes Carneades as a collaborative, online system for (quoting from [Gor13]): 

• modeling legal norms and argumentation schemes 

• (re)constructing arguments in an argument graph 

• visualizing, browsing and navigating argument graphs 

• critically evaluating arguments 

• forming opinions, participating in polls and ranking stakeholders by degrees of 
agreement 

• obtaining clear explanations, using argument graphs, of the differential effects of 
alternative policies or legal theories in particular cases 

So Carneades is a software system with very broad intended applications, but it is not 
misleading to say, more concisely, that it is a tool suite for supporting the practice of 
deliberate defeasible argumentation. 

I tried out Carneades. As of 15 Aug 2014, the program comes with only one example, 
called “Copyright in the Knowledge Economy”. The instructions for the “guided tour” of 
the example (described as an “opinion formation and polling tool”) certainly have a lot in 
common with the goals of this thesis: 

It guides you step by step through the arguments on all sides of a complex policy 
debate, providing you with an overview of the issues, positions and arguments 
in a systematic way. The tool can help you to form your own opinion, if you 
don’t yet have one, or to critically evaluate and reconsider your preexisting 
opinion, if you do. The tool also enables you to compare your responses 
with published positions of some stakeholders, such as the official positions of 
political parties. This can help you to find persons and organizations which 
best represent or share your views and interests. 

However, examining the argument itself one finds that, aside from the tree structure, it 
is not a great departure from typical natural language arguments. Here is a prototypical 
example, where “exceptions” means copyright exceptions: 

Q4. Should certain categories of exceptions be made mandatory to ensure more 
legal certainty and better protection of beneficiaries of exceptions? 

pro Argument #1: The permitted exceptions should be harmonised so that they 
are available in all Member States. 

• pro Argument #1: 

— Performing the action of harmonizing the exceptions and giving 
precedence to community law over contracts would achieve a state in 
which it easier for researchers and students to work in more than 
one Member State. 

— Harmonizing the copyright exceptions would make it easier for researchers 
and students to work in more than one Member State. 

— Achieving the goal of making it easier for researchers and students 
to work in more than one Member State would promote the values of 
efficiency, legal certainty, scientific research and education. 

— In the circumstances: Researchers and students increasingly work 
in more than one Member State. The patchy availability of exceptions 
makes their work difficult, because what is lawful in one country 
is probably unlawful in another. The situation is made worse by the 
provision of most Member States that contracts, governing the use 
of digital material, automatically overrides statute law. 

• con Argument #2: 

— It is essential that the basic principle of freedom of contract be 
recognized and preserved by any copyright legislation. 

— Harmonizing copyright exceptions would impair the freedom of contract. 

— Impairing the freedom of contract would demote the values of innovation 
and the dissemination of knowledge and information. 

— Currently, the lack of harmonization of copyright exceptions facilitates 
the freedom of contract. 

In the argument graphs approach they take, formal logic is not imposed on arguments. 
Instead, the dialogue features, together with the fundamental notions of an argument 
attacking or defending a proposition, must be used by one arguer to try to make the 
other’s reasoning seem less sound. 

9.1.2 Design of a web system 

This section describes work in progress. 

Interpreted formal proofs are written using a web-based IDE (integrated development 
environment), where the document is tree-structured except for some leaves that contain 
natural language text, which is scanned for symbol ids to insert references. There is 
an auto-complete feature for already-declared symbols. Declarations (axiom, lemma, or 
new symbol introduction) can be tagged, to make groups of declarations, in order to 
more-concisely specify a subset of declarations that should be used to prove a lemma.¹ 

¹ As I mentioned earlier in Section 2.4, this has so far been necessary when there are a large number 
of declarations, due to the non-goal-directed nature of the saturation-based first-order theorem provers 
that I have used. 

An author of an interpreted formal proof may grant edit privileges to other users, 
and simultaneous multi-collaborator editing, versioning, and unbounded undo will be 
implemented with the help of a realtime framework, such as the Google Drive Realtime 
API. The vision is that interested people who don’t know formal logic will be able to 
contribute to interpreted formal proofs by writing and improving the many required 
sections of natural language text, both for the language interpretation guide entries and 
for introductions to arguments. 

When a user begins to criticize an author’s interpreted formal proof, a dialogue data 
structure is created. It stores, first of all, the critic’s current critique of each axiom, which 
includes one of the following stances: 

• accept - the critic commits to having only personal interpretations of the current 
language that satisfy the axiom. 

• weakly reject - the critic commits to having some personal interpretations of the 
current language that satisfy the axiom, and some that falsify it. 

• strongly reject - the critic commits to having only personal interpretations of the 
current language that falsify the axiom. 

• semantics criticism - the critic submits at least one symbol used in the axiom whose 
language interpretation guide entry is too vague for them to evaluate the axiom - 
that is, to take one of the previous three stances. 

A critique of any of the latter three categories may have attached to it declarations 
that are owned by the critic (see Section 2.2.1 for details). The dialogue data structure 
also stores responses to those critiques by the author of the interpreted formal proof, 
which may also include new declarations. The author may make changes to the original 
declarations of their interpreted formal proof that are local only to one dialogue. 
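
As a rough illustration of the dialogue data structure just described, here is a minimal Python sketch. All class, field, and method names are hypothetical (the text does not specify an implementation), and declarations are represented as plain strings for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum

class Stance(Enum):
    ACCEPT = "accept"
    WEAKLY_REJECT = "weakly reject"
    STRONGLY_REJECT = "strongly reject"
    SEMANTICS_CRITICISM = "semantics criticism"

@dataclass
class Critique:
    axiom_id: str
    stance: Stance
    unclear_symbols: list[str] = field(default_factory=list)  # for semantics criticism
    declarations: list[str] = field(default_factory=list)     # owned by the critic

    def __post_init__(self):
        if self.stance is Stance.SEMANTICS_CRITICISM and not self.unclear_symbols:
            raise ValueError("semantics criticism must name at least one symbol")
        if self.stance is Stance.ACCEPT and self.declarations:
            raise ValueError("only the latter three stances may attach declarations")

@dataclass
class Dialogue:
    author: str
    critic: str
    critiques: dict[str, Critique] = field(default_factory=dict)   # current critique per axiom
    responses: dict[str, list[str]] = field(default_factory=dict)  # author's replies per axiom
    local_changes: list[str] = field(default_factory=list)         # author's dialogue-local edits

    def criticize(self, critique: Critique) -> None:
        # Only the critic's *current* critique of each axiom is stored.
        self.critiques[critique.axiom_id] = critique

    def respond(self, axiom_id: str, text: str) -> None:
        if axiom_id not in self.critiques:
            raise KeyError(f"no critique of {axiom_id} to respond to")
        self.responses.setdefault(axiom_id, []).append(text)

dialogue = Dialogue(author="author", critic="critic")
dialogue.criticize(Critique("Assumption 6", Stance.WEAKLY_REJECT,
                            declarations=["critic's counter-axiom"]))
dialogue.respond("Assumption 6", "author's reply")
```

Keeping one critique per axiom, rather than a history, follows the wording "the critic's current critique of each axiom"; a real system would presumably also version these.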

The greatest foreseen challenge is when the author wishes to make a change to their 
interpreted formal proof that affects all ongoing dialogues² (hereafter: a change to their 
interpreted formal proof’s root document), especially for changes that are not a direct 
response to a criticism. That includes improvements initiated by the author to the 
wordings of language interpretation guide entries (which will happen very often), as well 
as major structural changes to the proof to fix an earlier-made poor formalization decision 
(i.e. refactoring, as it’s called in software engineering). Such changes can subtly affect 
the meaning of critiques, or in the worst case render current dialogues unintelligible. My 
current best idea to address this is as follows. When an author wishes to make a change 
to a root document that is currently involved in at least one dialogue, they are asked to 
make a claim about the change’s effect on each ongoing dialogue, in particular whether 
they expect the change to require certain specific kinds of adaptations by the critics of 
the current state of their criticism. Fortunately, it should never be necessary to make 
adaptations to the entire dialogue, since the author’s change to the root document can be 
recorded in each dialogue and viewed in the same way as a change that was initiated as a 
response to a critique. 

² Note that dialogues will happen slowly, since the work is being done in users’ free time. 
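
One way the root-document rule above might be enforced is sketched below in Python. This is an assumption-laden illustration, not the system's design: the class names are hypothetical, dialogues are identified by strings, and a "claim" is just free text.

```python
from dataclasses import dataclass, field

@dataclass
class RootChange:
    description: str
    # dialogue id -> author's claim about adaptations the critic may need to make
    claims: dict[str, str]

@dataclass
class RootDocument:
    ongoing_dialogues: set[str]
    # dialogue id -> root changes recorded in that dialogue, in order
    history: dict[str, list[RootChange]] = field(default_factory=dict)

    def apply(self, change: RootChange) -> None:
        # The author must make a claim for every ongoing dialogue.
        missing = self.ongoing_dialogues - set(change.claims)
        if missing:
            raise ValueError(f"no claim for dialogues: {sorted(missing)}")
        # Record the change in each dialogue, like a response to a critique.
        for dialogue_id in self.ongoing_dialogues:
            self.history.setdefault(dialogue_id, []).append(change)

doc = RootDocument(ongoing_dialogues={"d1", "d2"})
doc.apply(RootChange(
    description="reword a language interpretation guide entry",
    claims={"d1": "no adaptation expected", "d2": "critic should re-check one critique"},
))
```

Recording the change per dialogue, instead of rewriting each dialogue, mirrors the observation that the whole dialogue should never need adapting.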

9.2 Obstacles for this project 

The true reason for this straying from the portal of knowledge³ is, I believe, 
that principles usually seem dry and not very attractive and are therefore 
dismissed with a mere taste. 

(Gottfried Leibniz, 1679, On the General Characteristic [LL76]) 

³ Leibniz is referring to his imagined characteristica universalis, or any similar project. 

I have come to believe that there is only one significant obstacle against this project 
gaining interest, which is the difficulty of writing and reading interpreted formal proofs. 
The proofs in this paper are tedious to read, even in HTML, and they were tedious to 
write as well. I do not have a perfect solution in mind for this problem. My current 
approach is to eliminate barriers to entry for use of the software system (e.g. programming 
experience shouldn’t be necessary to author or criticize an argument, and one should be 
able to get started immediately, on any operating system, without installation), and to 
minimize editing friction as much as possible (e.g. type-aware autosuggest, as found in 
modern IDEs for strongly-typed programming languages). 

For a long time I have been concerned about the practical effect of authors and critics 
who do not argue “in good faith”, according to the ideals set out in Section 2.2. I no longer 
believe that uncooperative authors will be a significant concern. Due to the fundamental 
difficulty of writing an argument as a computer-readable formal deductive proof (even with 
an excellent user interface), I expect to have interest only from authors who will want to 
write their arguments so that they are as strong as possible according to the critics whose 
reasoning they respect the most. In contrast to authoring a new interpreted formal proof, 
writing a simple criticism of an existing proof (Section 2.2.1) is by design not difficult, so 
there is greater potential for frivolous criticisms and other time-wasting uncooperative 
behavior. However, formal logic provides a natural notion of progress by which most 
earnest criticisms will “make progress”, namely the proof-theoretic strengthening of a 
system of axioms, and the growing proof-theoretic independence relations that result from 
a critic weakly-rejecting an axiom, or an author weakly-rejecting an axiom proposed by 
a critic (recall this means a commitment to the independence of the axiom, which has 
consequences of independence of other sentences). Of course, when an author’s intended 
interpretations of their language are infinite structures, this notion of progress is not 
necessarily terminating, even when the language is never non-conservatively extended, 
since for non-propositional first-order languages ℒ, there are infinite families of ℒ-theories 
Γ_1 ⊂ Γ_2 ⊂ ... such that Γ_{i+1} is strictly stronger than Γ_i for all i. I do not expect that 
technical possibility to arise in practice except when it is accompanied by non-conservative 
extensions of the language. There is a correspondence between the non-conservative 
extensions case and the familiar experience in informal argumentation where a dialogue is 
never ending due to the scope of the argument being repeatedly expanded. The theoretical 
issue is that it may be difficult to distinguish by any practical technical test between 
those non-conservative language extensions that are necessary to state a criticism, and 
those that are merely stalling. Whether this occurs often in practice, and if so how it 
affects the goals of the project, remains to be seen. 
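
For readers who want a concrete instance of such an infinite, strictly increasing family of theories, here is one standard example, written in LaTeX. It is an illustration chosen for this note, not taken from the arguments in this thesis; it uses the first-order language whose only predicate is equality, and the symbols φ_n and Γ_n are introduced here for the example.

```latex
% Requires amsmath (for \text) and amssymb (for \nvdash).
% phi_n says "there are at least n elements".
\[
  \varphi_n \;:=\; \exists x_1 \cdots \exists x_n
     \bigwedge_{1 \le i < j \le n} x_i \neq x_j
\]
% The theories form a strictly increasing chain.
\[
  \Gamma_n := \{\varphi_1, \dots, \varphi_n\},
  \qquad
  \Gamma_1 \subset \Gamma_2 \subset \cdots
\]
% Gamma_{n+1} is strictly stronger than Gamma_n:
\[
  \Gamma_{n+1} \vdash \varphi_{n+1}
  \quad\text{but}\quad
  \Gamma_n \nvdash \varphi_{n+1},
\]
% because a structure with exactly n elements satisfies Gamma_n
% and falsifies phi_{n+1}.
```

Every infinite structure satisfies all of the Γ_n, so a dialogue could in principle add one φ_n at a time forever without extending the language.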